Monday, September 14, 2020

Metrics

We were discussing a set of features for the stream store that bring the notion of accessing events in sorted order with skipped traversal. The events can be considered to be in some predetermined sequence in the event stream, whether by offset or by timestamp. These sequence numbers are in sorted order. Accessing any event in the stream, as if it were in a batch bounded by a head and a tail StreamCut that occur immediately before and after the event respectively, is now better than a linear traversal to read the event. This makes access to an event in the historical set of events in the stream O(log N). The skip-level access links, in the form of head and tail StreamCuts, can easily be built into the metadata on a catch-up basis after the events have accrued in the stream.

In addition, we have the opportunity to collect the fields and the possible values that occur in the events so that they can be leveraged in queries later. This enhancement of metadata from events in the stream becomes useful for finding similar events.

The use of standard query operators with the events in the stream has been made possible by the Flink programming library, but the logic written with those operators usually is not aware of all the fields that have been extracted from the events. By closing the gap between field extraction and the availability of new fields in query logic, applications can not only improve existing logic but also write new logic.

The extraction of fields and their values provides an opportunity to discover not only the range of values that certain keys can take across all the events in the stream but also their distribution. Events are numerous, and there is no go-to source for statistics about events, especially similar-looking events. Two streams holding similar events may have them in very different orders and with very different arrival times. If the stream store is unaware of the contents of the events, it can only tell the number of events and the size they occupy. But with some insight into the events, such as information about their source, a whole new set of metrics becomes available, which can help with summary information, point-of-origin troubleshooting, contribution/spread calculation, and better resource allocation.

As with many metrics, there is a sliding window for timestamp-based datapoint collection for the same metric. Although metrics support flexible naming conventions, prefixes, and paths, the same metric may accumulate a lot of datapoints over time. A sliding window presents a limited range that can help with aggregation functions such as latest, maximum, minimum, and count over this range for subsequent trend analysis. Queries on metrics are facilitated with the help of annotations on the metric data and pre-defined metadata. These queries can use any language, but queries using search operators are preferred for their similarity to shell-based execution environments. In this way metrics provide handy information about the stream that would otherwise have to be interpreted by running offline analysis on logs. Summary statistics from metrics can now be saved with the metadata.
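To make the idea concrete, here is a minimal sketch of such a sliding-window aggregation over a single metric; the datapoint shape, the five-minute window, and the sample values are illustrative assumptions rather than the store's actual metric types.

import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// A minimal sliding-window aggregator over timestamped datapoints of one metric.
// The window size and the datapoint shape are illustrative assumptions.
public class SlidingWindowMetric {
    private static class DataPoint {
        final Instant timestamp;
        final double value;
        DataPoint(Instant timestamp, double value) { this.timestamp = timestamp; this.value = value; }
    }

    private final Deque<DataPoint> window = new ArrayDeque<>();
    private final Duration range;

    public SlidingWindowMetric(Duration range) { this.range = range; }

    // Append a new datapoint and evict anything older than the window range.
    public void record(Instant timestamp, double value) {
        window.addLast(new DataPoint(timestamp, value));
        Instant cutoff = timestamp.minus(range);
        while (!window.isEmpty() && window.peekFirst().timestamp.isBefore(cutoff)) {
            window.removeFirst();
        }
    }

    // Aggregations over the current window, as mentioned above.
    public double latest()  { return window.getLast().value; }
    public long   count()   { return window.size(); }
    public double maximum() { return window.stream().mapToDouble(d -> d.value).max().orElse(Double.NaN); }
    public double minimum() { return window.stream().mapToDouble(d -> d.value).min().orElse(Double.NaN); }

    public static void main(String[] args) {
        SlidingWindowMetric metric = new SlidingWindowMetric(Duration.ofMinutes(5));
        Instant now = Instant.now();
        metric.record(now.minusSeconds(240), 10.0);
        metric.record(now.minusSeconds(60), 42.0);
        metric.record(now, 7.0);
        System.out.printf("latest=%.1f max=%.1f min=%.1f count=%d%n",
                metric.latest(), metric.maximum(), metric.minimum(), metric.count());
    }
}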

Having described the collection of metrics from streams with primitive information about events, let us now see how to boost the variety and customization of metrics. In this regard, the field extraction performed by parsing historical events provides additional datapoints, which become helpful in generating metrics. For example, metrics can now be based on field names and their values as they occur throughout the stream, giving information on the priority and severity of certain events. The summary information about the stream now includes metrics about criteria pertaining to the events.
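For example, a small sketch that tallies the values of one extracted field (a hypothetical "severity" key) so that per-value counts can be reported as metrics; the field name and the event shape are assumptions for illustration.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Tallies occurrences of each value of an extracted field across events.
// The field name "severity" and the event shape are illustrative assumptions.
public class FieldValueCounter {
    private final String fieldName;
    private final Map<String, LongAdder> countsByValue = new ConcurrentHashMap<>();

    public FieldValueCounter(String fieldName) { this.fieldName = fieldName; }

    // Called once per event with its extracted key-value properties.
    public void accept(Map<String, String> extractedFields) {
        String value = extractedFields.get(fieldName);
        if (value != null) {
            countsByValue.computeIfAbsent(value, v -> new LongAdder()).increment();
        }
    }

    public Map<String, LongAdder> snapshot() { return countsByValue; }

    public static void main(String[] args) {
        FieldValueCounter counter = new FieldValueCounter("severity");
        counter.accept(Map.of("severity", "ERROR", "source", "sensor-1"));
        counter.accept(Map.of("severity", "INFO"));
        counter.accept(Map.of("severity", "ERROR"));
        System.out.println(counter.snapshot()); // e.g. {ERROR=2, INFO=1}
    }
}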


Sunday, September 13, 2020

Field property Extraction:

We were discussing a feature for the stream store that brings the notion of accessing events in sorted order with skipped traversal. The events can be considered to be in some predetermined sequence in the event stream, whether by offset or by timestamp. These sequence numbers are in sorted order. Accessing any event in the stream, as if it were in a batch bounded by a head and a tail StreamCut that occur immediately before and after the event respectively, is now better than a linear traversal to read the event. This makes access to an event in the historical set of events in the stream O(log N). The skip-level access links, in the form of head and tail StreamCuts, can easily be built into the metadata on a catch-up basis after the events have accrued in the stream.

This says nothing about the contents of the events and has little or no possibility of data corruption since it is entirely a metadata activity. If we do consider the data to have some text, then we could leverage that to enhance the metadata.

The data that gets written to an event in the stream is not parsed, because it may contain binary data and the stream store does not know what may be useful to customers. But that decision could be deferred by extracting all key-value properties that occur as patterns in the text portion of the data.


Many devices send IoT data in the form of JSON or XML, if not plain text, besides binary data. This gives the opportunity to collect the fields and the possible values that occur in the events so that they can be leveraged in queries later. This enhancement of metadata from events in the stream becomes useful for finding similar events.
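As a rough illustration, the sketch below scans the textual portion of an event payload for key-value pairs. The two patterns (key=value and JSON-style "key": "value") are assumptions for illustration; a real JSON or XML parser would be preferred when the payload format is known.

import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts key-value properties from the textual portion of an event payload.
// The two patterns below are illustrative assumptions.
public class FieldExtractor {
    private static final Pattern KEY_EQUALS_VALUE =
            Pattern.compile("(\\w+)=([^\\s,;]+)");
    private static final Pattern JSON_STYLE_PAIR =
            Pattern.compile("\"(\\w+)\"\\s*:\\s*\"([^\"]*)\"");

    public static Map<String, String> extract(byte[] eventPayload) {
        String text = new String(eventPayload, StandardCharsets.UTF_8);
        Map<String, String> fields = new LinkedHashMap<>();
        for (Pattern pattern : new Pattern[] { JSON_STYLE_PAIR, KEY_EQUALS_VALUE }) {
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                fields.putIfAbsent(matcher.group(1), matcher.group(2));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        byte[] payload = "{\"severity\": \"ERROR\", \"source\": \"sensor-1\"} temp=41.5"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(extract(payload)); // {severity=ERROR, source=sensor-1, temp=41.5}
    }
}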


The use of standard query operators with the events in the stream has been made possible by the Flink programming library, but the logic written with those operators usually is not aware of all the fields that have been extracted from the events. By closing the gap between field extraction and the availability of new fields in query logic, applications can not only improve existing logic but also write new logic.
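A minimal Flink sketch of that closed gap is shown below. The ParsedEvent type and the in-memory source are stand-ins; in practice the events would come from the stream store's Flink connector with the extracted fields attached.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// A hypothetical POJO carrying fields already extracted from the event payload.
public class SeverityFilterJob {
    public static class ParsedEvent {
        public String severity;
        public String source;
        public ParsedEvent() { }
        public ParsedEvent(String severity, String source) {
            this.severity = severity;
            this.source = source;
        }
        @Override
        public String toString() { return source + ":" + severity; }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in source; in practice the events would come from the stream
        // store's Flink connector with the extracted fields attached.
        DataStream<ParsedEvent> events = env.fromElements(
                new ParsedEvent("ERROR", "sensor-1"),
                new ParsedEvent("INFO", "sensor-2"),
                new ParsedEvent("ERROR", "sensor-3"));

        // A standard query operator over the newly extracted field.
        events.filter(e -> "ERROR".equals(e.severity)).print();

        env.execute("severity-filter");
    }
}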


The enhancement of metadata for a stream works independently of the events as well. For example, an application browsing the streams in a scope or a store now has properties to narrow down the stream of interest. This is significantly better than reading all the events in the store.


The activities to build this metadata are incremental and progressive as the events arrive in the stream, due to the append-only nature of the stream. Even when the stream is truncated, the metadata does not lose its value. These activities can also keep up with the rate of incoming data.

 

Saturday, September 12, 2020

Sorted events


This article is a brief glimpse of a feature for a stream store that is known for its limitless continuous storage of device data traffic. Events arrive at the stream store sequenced by their timestamps, and the stream store guarantees that they will be saved in the same order in which they arrive. Other than that, the stream store does not make any assumptions about or interpret the contents of the events in a stream. This leads to a design where all events must be iterated sequentially, one after the other, in the same original order in which they were preserved. Queries reading the stream have to start from the beginning of the stream each time they repeat a traversal of a sequence of events, unless they mark the delimiters with the help of StreamCuts.

There are a few problems we can solve further. First, the timestamp of an event is not a user-friendly field and is certainly inconvenient for comparison and sorting. Queries must maintain their own sequence numbers if they want the ability to sort event references in arrays. Given a set of events, there is no lookup table to see where they are located in a stream. Client applications that write these queries against the store have to build cumbersome logic or delegate it to stream processing packages such as Flink. Even if Flink is used, the iteration cannot be ruled out, as the syntax relies on processing events one by one, which is still O(N) complexity, and N tends to be large in a stream.

Assuming that events are sequenced by numbers and that the client would like to skip events by increasing multiples of two, there is no handy next-link access between events as there would be in a skip list. The events lose their activity as they become old and are replaced by newer events. A certain portion towards the head of the stream marks the cold portion of the stream, where the events cease to be as relevant or meaningful as the events arriving at the tail end. By earmarking a retention period, the head is folded towards the warm range of the stream, but the data is gone if it was in the truncated segments of the stream. If there were metadata, such as a table segment, that could hold key-values allowing events or even stream ranges to carry auxiliary information in the form of skip-level access, then the traversal becomes logarithmically faster.

This kind of metadata for introducing newer and faster semantics to iteration allows access between the cold, warm, and hot segment ranges of a stream in both a forward and a backward manner by readjusting the head to the start of the event to be read. The metadata is often isolated and stored in a data structure different from the one used for the data, which implies there is little or no risk of corruption to the original events in the stream. Only a shuffling of StreamCuts is sufficient to quickly jump back and forth between different sequences of events.

Since the events maintain their order in the stream from the time of arrival, the metadata created for the events also tends to be incremental in its changes, with the nice side effect that code working on an earlier range does not have to change.

There is already a batch reader that can read a historical segment range between a head and a tail StreamCut of a stream. By providing different head and tail references using the above-mentioned metadata, it is now possible to access the events out of chronological order and with fewer reads.
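A rough sketch of such a bounded read using the Pravega batch client is shown below. The scope, stream name, and controller URI are placeholders, the two StreamCuts stand in for the ones kept in the skip-level metadata, and the exact client API should be verified against the client version in use.

import io.pravega.client.BatchClientFactory;
import io.pravega.client.ClientConfig;
import io.pravega.client.batch.SegmentIterator;
import io.pravega.client.batch.SegmentRange;
import io.pravega.client.stream.Stream;
import io.pravega.client.stream.StreamCut;
import io.pravega.client.stream.impl.UTF8StringSerializer;

import java.net.URI;
import java.util.Iterator;

// Reads only the events bounded by a head and a tail StreamCut, instead of
// replaying the whole stream. Scope, stream, and controller URI are illustrative;
// the two StreamCuts would come from the skip-level metadata described above.
public class BoundedBatchRead {
    public static void main(String[] args) {
        ClientConfig config = ClientConfig.builder()
                .controllerURI(URI.create("tcp://localhost:9090"))
                .build();
        Stream stream = Stream.of("myScope", "myStream");

        // Placeholders: in practice these are the persisted head/tail cuts
        // surrounding the event or range of interest.
        StreamCut head = StreamCut.UNBOUNDED;
        StreamCut tail = StreamCut.UNBOUNDED;

        try (BatchClientFactory batchClient = BatchClientFactory.withScope("myScope", config)) {
            Iterator<SegmentRange> ranges =
                    batchClient.getSegments(stream, head, tail).getIterator();
            while (ranges.hasNext()) {
                try (SegmentIterator<String> events =
                             batchClient.readSegment(ranges.next(), new UTF8StringSerializer())) {
                    while (events.hasNext()) {
                        System.out.println(events.next());
                    }
                }
            }
        }
    }
}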


Friday, September 11, 2020

Distributed stream stores continued

A federation does not mandate a P2P network, but it does share the principles of a top-heavy architecture because it brings consistency across all members.

Chaining algorithms:

Although chaining refers to something in continuity, as in the chain of trust for certificates or the chain of custody in a pipeline of devOps activities, this article introduces the notion of chaining in the form of linked stream stores. A set of linked stream stores helps with distributed queries, where the stream may be resolved not in the immediate store where the query is received but in a store that can be reached over links and is the true owner of that stream. In that sense, chaining helps with the relay of control and data traffic as well as analytical workloads. Certain stores have the ability to include geo-replication and provide a content distribution network, which allows them to internalize all load-balancing, assignment, and fulfilment activities, but in a distributed setting we can take some of those concerns out of the stream stores and spread them across an entire collection.

Using simple techniques of stream resolution by name or classification, participating stores can forward requests until a specific store is found. When the naming convention utilizes the ordering of stream stores, the access becomes logarithmically more efficient than next hops. For example, let us say that stream stores are arranged in contiguous ranges of a key space uniformly spread between start and finish. Any new stream store can be inserted into a range, and an old one can be removed from this key range, without affecting the assignments of the other stores.
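A minimal sketch of such a range assignment is shown below, using a sorted map from the start of each contiguous key range to the owning store. The store names and the choice of hash are illustrative assumptions.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Resolves a stream name to the store owning its key range. Stores own
// contiguous ranges of a uniform key space; adding or removing one store only
// affects its own range. Names and the hash choice are illustrative.
public class StoreRangeResolver {
    // Maps the inclusive start of each key range to the owning store.
    private final TreeMap<Long, String> rangeStartToStore = new TreeMap<>();

    public void assign(long rangeStart, String storeName) {
        rangeStartToStore.put(rangeStart, storeName);
    }

    public void remove(long rangeStart) {
        rangeStartToStore.remove(rangeStart);
    }

    // Classify the stream name into the key space, then find the greatest
    // range start that does not exceed it (falling back to the last range).
    public String resolve(String streamName) {
        long key = keyFor(streamName);
        Map.Entry<Long, String> owner = rangeStartToStore.floorEntry(key);
        if (owner == null) {
            owner = rangeStartToStore.lastEntry();
        }
        return owner == null ? null : owner.getValue();
    }

    private static long keyFor(String streamName) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(streamName.getBytes(StandardCharsets.UTF_8));
            long key = 0;
            for (int i = 0; i < 8; i++) {
                key = (key << 8) | (digest[i] & 0xFF);
            }
            return key & Long.MAX_VALUE; // keep the key space non-negative
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StoreRangeResolver resolver = new StoreRangeResolver();
        resolver.assign(0L, "store-a");
        resolver.assign(Long.MAX_VALUE / 3, "store-b");
        resolver.assign(2 * (Long.MAX_VALUE / 3), "store-c");
        System.out.println(resolver.resolve("sensor-telemetry-stream"));
    }
}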

Given a stream name, let us say a classifier has identified the store to belong to a key range. Now the lookup of that store in the overall key space can follow the same convention as the lookup of a node in a skip-list. The common techniques in such lookups involve the following criteria:

1. Starting at the highest level of skipping, we access a store in the desired range or the one immediately before it.

2. Then dropping down a level, we repeat step 1 until we reach the intended store.

The algorithm above works in all cases because it has an invariant and makes incremental progress towards a known termination. Therefore, its correctness can be established.

The arrangement for a skip-list is governed by the following criteria:

1. Each skip list node has a height and a set of neighboring skip list nodes, precisely as many neighboring nodes as its height, some of which may be null.

2. Each skip list node has some piece of data associated with it.

3. Each skip list node has a key to look it up.

The skip list is initialized with the maximum height of any node. The head and tail sentinels are initialized and attached.


Insert operation involves the following:

Find where the node goes by comparing keys

If it is a duplicate, don't insert; otherwise perform the insert

Use a random number to generate a height


Remove operation involves the following:

Find the node to be deleted by traversing from the maximum available height down to the lowest level until we reach the node

For each of the levels on the current node, update the predecessors' forward references to bypass it and delete the node.

With these accesses outlined, the stream stores can be chained in the same way as the nodes in a skip-list for efficient traversal.
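Pulling these criteria together, a compact single-threaded sketch of such a skip list keyed by store name is shown below. The store names and addresses are illustrative, and the height cap and coin-flip promotion follow the usual convention.

import java.util.concurrent.ThreadLocalRandom;

// A compact skip list keyed by store name, following the criteria above: each
// node has a height, as many forward references as its height, a key to look
// it up, and a piece of associated data (here, a store address).
public class StoreSkipList {
    private static final int MAX_HEIGHT = 16;

    private static final class Node {
        final String key;
        final String data;
        final Node[] forward;
        Node(String key, String data, int height) {
            this.key = key;
            this.data = data;
            this.forward = new Node[height];
        }
    }

    // Head sentinel at maximum height; null forward references act as the tail.
    private final Node head = new Node(null, null, MAX_HEIGHT);

    // Start at the highest level and move forward while the next key is still
    // smaller, dropping a level each time: the logarithmic lookup described above.
    public String find(String key) {
        Node node = head;
        for (int level = MAX_HEIGHT - 1; level >= 0; level--) {
            while (node.forward[level] != null && node.forward[level].key.compareTo(key) < 0) {
                node = node.forward[level];
            }
        }
        Node candidate = node.forward[0];
        return (candidate != null && candidate.key.equals(key)) ? candidate.data : null;
    }

    public void insert(String key, String data) {
        Node[] predecessors = findPredecessors(key);
        Node existing = predecessors[0].forward[0];
        if (existing != null && existing.key.equals(key)) {
            return; // duplicate: don't insert
        }
        int height = randomHeight(); // a random number decides the node's height
        Node fresh = new Node(key, data, height);
        for (int level = 0; level < height; level++) {
            fresh.forward[level] = predecessors[level].forward[level];
            predecessors[level].forward[level] = fresh;
        }
    }

    public void remove(String key) {
        Node[] predecessors = findPredecessors(key);
        Node target = predecessors[0].forward[0];
        if (target == null || !target.key.equals(key)) {
            return; // not present
        }
        // For each level on the target node, splice the forward references around it.
        for (int level = 0; level < target.forward.length; level++) {
            predecessors[level].forward[level] = target.forward[level];
        }
    }

    // The last node strictly before the key at every level.
    private Node[] findPredecessors(String key) {
        Node[] predecessors = new Node[MAX_HEIGHT];
        Node node = head;
        for (int level = MAX_HEIGHT - 1; level >= 0; level--) {
            while (node.forward[level] != null && node.forward[level].key.compareTo(key) < 0) {
                node = node.forward[level];
            }
            predecessors[level] = node;
        }
        return predecessors;
    }

    private static int randomHeight() {
        int height = 1;
        while (height < MAX_HEIGHT && ThreadLocalRandom.current().nextBoolean()) {
            height++;
        }
        return height;
    }

    public static void main(String[] args) {
        StoreSkipList stores = new StoreSkipList();
        stores.insert("store-a", "10.0.0.1:9090");
        stores.insert("store-b", "10.0.0.2:9090");
        stores.insert("store-c", "10.0.0.3:9090");
        stores.remove("store-b");
        System.out.println(stores.find("store-c")); // 10.0.0.3:9090
        System.out.println(stores.find("store-b")); // null
    }
}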


Thursday, September 10, 2020

Distributed stream stores continued

A P2P network introduces a network-first design where peers are autonomous agents, and there is a protocol to enable them to negotiate contracts, transfer data, verify the integrity and availability of remote data, and reward with payments. It provides tools to enable all these interactions. Moreover, it enables the distribution of the storage of a file as shards on this network, and these shards can be stored using a distributed hash table. The shards themselves need not be stored in this hash table; rather, a distributed network and messaging could facilitate it with location information.

Messages are helpful to enforce consistency as nodes come up or go down. For example, a gossip protocol may be used for this purpose, and it involves propagating updates via message exchanges. Message exchanges can include state or operation transfers. Both involve the use of vector clocks.

In the state transfer model, each replica maintains a state version tree which contains all the conflicting updates. When the client sends its vector clock, the replicas will check whether the client state precedes any of their current versions and discard it accordingly. When a replica receives updates from other replicas via gossip, it will merge the version trees.

In the operation transfer model, each replica has to first apply all operations corresponding to the cause before those corresponding to the effect. This is necessary to keep the operations in the same sequence on all replicas and is achieved by adding another entry in the vector clock, a V-state, that represents the time of the last updated state. In order that this causal order is maintained, each replica will buffer the update operation until it can be applied to the local state. A tuple of two timestamps, one from the client's view and another from the replica's local view, is associated with every submitted operation. Since operations are in different stages of processing on different replicas, a replica will not discard the state or operations it has completed until it sees that the vector clocks from all others have preceded it.
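As a concrete illustration of the precedence check and merge described above, here is a minimal vector clock sketch; the replica identifiers and the usage in main() are illustrative. A clock precedes another when every entry is less than or equal and at least one is strictly smaller, and a merge takes the entry-wise maximum.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal vector clock used in the gossip-style reconciliation described above.
// Replica identifiers and the usage in main() are illustrative.
public class VectorClock {
    private final Map<String, Long> entries = new HashMap<>();

    // Record a local event at the given replica.
    public void increment(String replicaId) {
        entries.merge(replicaId, 1L, Long::sum);
    }

    // True if this clock causally precedes the other: every entry is <= the
    // other's entry and at least one entry is strictly smaller.
    public boolean precedes(VectorClock other) {
        boolean strictlySmaller = false;
        Set<String> replicas = new HashSet<>(entries.keySet());
        replicas.addAll(other.entries.keySet());
        for (String replica : replicas) {
            long mine = entries.getOrDefault(replica, 0L);
            long theirs = other.entries.getOrDefault(replica, 0L);
            if (mine > theirs) {
                return false;
            }
            if (mine < theirs) {
                strictlySmaller = true;
            }
        }
        return strictlySmaller;
    }

    // Two clocks conflict when neither precedes the other and they are not equal;
    // both versions are then kept in the version tree.
    public boolean conflictsWith(VectorClock other) {
        return !precedes(other) && !other.precedes(this) && !entries.equals(other.entries);
    }

    // Entry-wise maximum, used when merging version trees after a gossip exchange.
    public VectorClock merge(VectorClock other) {
        VectorClock merged = new VectorClock();
        merged.entries.putAll(entries);
        other.entries.forEach((replica, time) -> merged.entries.merge(replica, time, Math::max));
        return merged;
    }

    public static void main(String[] args) {
        VectorClock client = new VectorClock();
        VectorClock replica = new VectorClock();
        client.increment("r1");
        replica.increment("r1");
        replica.increment("r2");
        System.out.println(client.precedes(replica));     // true: the replica may discard the client state
        System.out.println(replica.conflictsWith(client)); // false
        System.out.println(client.merge(replica).precedes(replica)); // false: the merge is not behind
    }
}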


Wednesday, September 9, 2020

Exceptions encountered when setting up TLS on the server

 java.lang.IllegalArgumentException: File does not contain valid certificates: /opt/pravega/conf/client.truststore.jks

        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:182)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.getSslContext(ConnectionPoolImpl.java:280)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.getChannelInitializer(ConnectionPoolImpl.java:237)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.establishConnection(ConnectionPoolImpl.java:194)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.getClientConnection(ConnectionPoolImpl.java:128)

        at io.pravega.client.netty.impl.ConnectionFactoryImpl.establishConnection(ConnectionFactoryImpl.java:62)

        at io.pravega.client.netty.impl.RawClient.<init>(RawClient.java:87)

        at io.pravega.controller.server.SegmentHelper.updateTableEntries(SegmentHelper.java:403)

        at io.pravega.controller.store.stream.PravegaTablesStoreHelper.lambda$addNewEntry$10(PravegaTablesStoreHelper.java:178)

        at io.pravega.controller.store.stream.PravegaTablesStoreHelper.lambda$null$54(PravegaTablesStoreHelper.java:534)

        at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)

        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)

        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

Caused by: java.security.cert.CertificateException: found no certificates in input stream

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:98)

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:64)

        at io.netty.handler.ssl.SslContext.toX509Certificates(SslContext.java:1071)

        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:180)

        ... 19 common frames omitted


The configuration options for Kubernetes deployments are somewhat different from the standalone case because Kubernetes recognizes the X.509 and PKCS#8 formats. A tls.crt and tls.key pair suffices to configure the TLS server. Since the most convenient way to create the certificate and key pair on Kubernetes is via cert-manager, these will usually be available as a Kubernetes secret, which can then be mounted as a volume at, say, '/etc/secret-volume'. The configuration therefore looks like:

        TLS_ENABLED = "true"

        TLS_KEY_FILE = "/etc/secret-volume/tls.key"

        TLS_CERT_FILE = "/etc/secret-volume/tls.crt"

        TLS_TRUST_STORE = "/etc/secret-volume/tls.crt"

        TLS_ENABLED_FOR_SEGMENT_STORE = "true"

        REST_KEYSTORE_FILE_PATH = "/opt/pravega/server.keystore.jks"

        REST_KEYSTORE_PASSWORD_FILE_PATH = "/opt/pravega/server.keystore.jks.passwd"


Both the controller and the segmentstore make use of configmaps for their deployment/statefulset respectively. These configmaps can be displayed to verify that the options are correctly set.

The options were set, but I need to make changes because there was a mistake. The Pravega controller and segment store are already running. What do I do?

The Pravega controller and segment store are typical Kubernetes applications. They can be scaled down to zero replicas, the configuration adjusted, and the replicas then scaled back up to the desired number. The server logs will confirm whether the new options were picked up.

I see exceptions I don't recognize in the controller logs. How do I overcome them?

A few more exceptions encountered in trying out the change above include:

1) Caused by: java.lang.IllegalStateException: Expected the service ZKGarbageCollector [FAILED] to be RUNNING, but the service has FAILED

        at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:366)

        at com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:302)

        at io.pravega.controller.store.stream.PravegaTablesStreamMetadataStore.<init>(PravegaTablesStreamMetadataStore.java:77)

        at io.pravega.controller.store.stream.PravegaTablesStreamMetadataStore.<init>(PravegaTablesStreamMetadataStore.java:67)

        at io.pravega.controller.store.stream.StreamStoreFactory.createStore(StreamStoreFactory.java:37)

        at io.pravega.controller.server.ControllerServiceStarter.startUp(ControllerServiceStarter.java:230)

        at com.google.common.util.con

2) Caused by: java.security.cert.CertificateException: found no certificates in input stream

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:98)

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:64)

        at io.netty.handler.ssl.SslContext.toX509Certificates(SslContext.java:1071)

        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:180)

3) java.io.IOException: Invalid keystore format

        at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:658)

        at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:56)

        at sun.security.provider.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:224)

        at sun.security.provider.JavaKeyStore$DualFormatJKS.engineLoad(JavaKeyStore.java:70)

        at java.security.KeyStore.load(KeyStore.java:1445)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.getTrustManager(ZooKeeperServiceRunner.java:220)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.waitForSSLServerUp(ZooKeeperServiceRunner.java:185)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.waitForServerUp(ZooKeeperServiceRunner.java:164)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.start(ZooKeeperServiceRunner.java:109)

        at io.pravega.local.InProcPravegaCluster.startLocalZK(InProcPravegaCluster.java:210)

        at io.pravega.local.InProcPravegaCluster.start(InProcPravegaCluster.java:182)

        at io.pravega.local.LocalPravegaEmulator.start(LocalPravegaEmulator.java:153)

        at io.pravega.local.LocalPravegaEmulator.main(LocalPravegaEmulator.java:128)

In these cases, look for the certificate exceptions. They usually carry a clear message and a directive for fixing the problem. Certificate exceptions are very common, so there are plenty of resources online for them, and each one can be addressed individually.

For example, a frequently encountered exception is certificate verification failure as shown below.

DEBUG [2020-09-03 15:04:50.374] [grpc-default-worker-ELG-1-1] i.n.h.s.ReferenceCountedOpenSslContext: verification of certificate failed

java.security.cert.CertificateException: No subject alternative DNS name matching nautilus-pravega-controller.nautilus-pravega.svc.cluster.local found.

        at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:214)

        at sun.security.util.HostnameChecker.match(HostnameChecker.java:96)

        at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455)

        at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:436)

        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:252)

        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)

        at io.netty.handler.ssl.OpenSslTlsv13X509ExtendedTrustManager.checkServerTrusted(OpenSslTlsv13X509ExtendedTrustManager.java:223)

        at io.netty.handler.ssl.ReferenceCountedOpenSslClientContext$ExtendedTrustManagerVerifyCallback.verify(ReferenceCountedOpenSslClientContext.java:255)

        at io.netty.handler.ssl.ReferenceCountedOpenSslContext$AbstractCertificateVerifier.verify(ReferenceCountedOpenSslContext.java:701)

        at io.netty.internal.tcnative.SSL.readFromSSL(Native Method)

        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.readPlaintextData(ReferenceCountedOpenSslEngine.java:596)

        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1181)

        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1298)

        at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:201)

        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1372)

        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)

        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)

        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)

        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)

        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)

        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)

        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)

        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)

        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)

        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

        at java.lang.Thread.run(Thread.java:748)

command terminated with exit code 137


In this case, set the name to what is expected for the controller server, as shown in the certificate specification below. You can also selectively disable verification with:

grpc.ssl_target_name_override=pravega-pravega-controller.default

com.sun.net.ssl.checkRevocation=false

grpc.default_authority=pravega-pravega-controller.default

Finally, remember that certificates and keys can be verified offline before their deployment using standard command line tools as shown below:


openssl rsa -noout -modulus -in tls.key | openssl md5

openssl x509 -noout -modulus -in tls.crt | openssl md5

openssl x509 -x509toreq -in tls.crt -out tls.csr -signkey tls.key

openssl req -text -noout -verify -in tls.csr

curl -i -k --cacert conf/cert.pem -u admin:1111_aaaa https://0.0.0.0:9091/v1/scopes -vvv

Always use the PKCS#8 format of tls.key after cert-manager generates it. For example:

openssl pkcs8 -in tls.key -inform PEM -out tls.8.key -outform PEM -topk8 -nocrypt


A sample cert-manager specification is also included below:

apiVersion: certmanager.k8s.io/v1alpha1

kind: Issuer

metadata:

  name: sdp-tls-issuer

  namespace: nautilus-pravega

spec:

  selfSigned: {}

---

apiVersion: certmanager.k8s.io/v1alpha1

#apiVersion: certificates.certmanager.k8s.io/v1beta1

kind: Certificate

metadata:

  name: sdp-tls-cert

  namespace: nautilus-pravega

spec:

  secretName: sdp-tls-cert-tls

  commonName: 'nautilus-pravega-controller.nautilus-pravega'

  dnsNames:

    - '*.cluster.local'

    - '*.svc.cluster.local'

    - pravega-pravega-controller.nautilus-pravega

    - pravega-pravega-segmentstore.nautilus-pravega

    - pravega-pravega-controller

    - pravega-pravega-segmentstore

  issuerRef:

    name: sdp-tls-issuer


Tuesday, September 8, 2020

Distributed Stream Stores (continued...)

Comparison to blockchain

Blockchain technology has been used to decentralize security and proves to be great for identity management. A blockchain is a continuously growing list of records, called blocks, which are linked and secured using cryptography. Since it is resistant to tampering, it becomes an open distributed ledger to record transactions between two parties. In identity management, it avoids the use of an authentication server and a password database. Each device or user medium is given a private key that is guaranteed to be unique, and control from that device, such as the click of a button, is the equivalent of signing in.

Products like Remme, Civic, and Storj have demonstrated applications of blockchain to identity and storage purposes. Its applicability to federations is also possible because the federation does not mandate a particular technology for any given member.

Civic, for instance, enables its users to log in with their fingerprints. It uses a variety of smart contracts, an indigenous utility token, and new software applications. A Merkle tree is used for attestation.

The Storj network was initiated to address scalability and increase decentralization. It is a distributed cloud storage network and removes the notion of a centralized third-party storage provider. The decentralization not only helps mitigate traditional data failures and outages but also supports new workloads such as those from blockchain. There is a high degree of privacy for the individual whose transactions are maintained in this ledger. It does not divulge any personally identifiable information and can still prove ownership of entries. The ledger itself is maintained by a community where no one actor can gain enough influence to submit a fraudulent transaction or alter recorded data. The Storj network facilitates a security, privacy, and data control model. In production storage, peer-to-peer networks were not as popular as the instance-centric model because data accrues based on popularity, not on utility. The Storj network introduces a challenge-response verification system combined with direct payments. In addition, there is a set of federated nodes that alleviate access and performance concerns. The Storj network also brings client-side encryption.

Emerging trends like blockchain have no precedent for storage standards, so this is an opportunity to create one.