The Review and Comparison between Centralized and Decentralized Digital Identity Systems



Introduction
The advent of the Internet has connected people across the world with an efficiency, reliability, and consistency previously thought impossible. Through personal machines such as laptops and smartphones, people can stay constantly online, receiving messages from friends and emails from websites and blogs. This period is generally defined as the "consumer-oriented Internet" [1], in which the subjects served by the Internet are mainly people. However, with the rapid growth of millions of online services, the subjects served by the Internet now extend to machines, devices, and digital objects. Thus, the era of the Industrial Internet, along with the concept of the Internet of Things (IoT), is coming. To interconnect this enormous number of subjects, digital identity plays an indispensable role in the new era.
Since the 1980s, the Domain Name System (DNS) has provided a digital name service for the early Internet, then called the DARPA Internet, and it is one of the largest name services in operation today [2]. During the same period, the Object Identifier (OID) architecture emerged with the purpose of naming any object, concept, or thing with a globally unambiguous persistent name [3]. About ten years later, Bob Kahn, well known as "the father of the Internet," proposed the Handle System, a distributed information system designed to provide an efficient, extensible, and secure global name service for networks [4,5]. These infrastructures all had a great impact on the development of the Industrial Internet, but they rely on a single centralized authority, which may cause authenticity and scalability problems. Thus, a more robust, persistent, distributed, and self-controlled identifier, known as the decentralized identifier (DID), was put forward around 2016 [6] (Figure 1).

Domain Name System (DNS).
Generally, the DNS can be understood in two parts: the domain name and domain name system resolution (DNS resolution). The domain name is the basic identifier resource of the Internet. It is composed of strings, such as "Google.com" and "Facebook.com," and is used to identify digital objects on the Internet, for example, websites, email services, and Internet Protocol (IP) addresses. The domain name gives users easy access to service providers without memorizing complex IP addresses. DNS resolution, on the other hand, is the translation process between domain names and IP addresses, and it relies on a distributed database that records the mapping between the two.

Domain Name.
The domain name is a variable-depth tree structure, which consists of a root domain, top-level domains, and the subsequent second-level, third-level, and lower-level domains [2,7]. Usually, the rightmost string is the top-level domain name, the string immediately to its left is the second-level domain name, and so on (Figure 2).
The root domain is located at the top of the entire domain name system and is represented by ".", which is usually omitted. According to the type of operating organization, top-level domains can be divided into (1) generic top-level domains (gTLDs) and (2) country code top-level domains (ccTLDs). The gTLDs are commonly seen in daily Internet activities, such as ".com," ".net," and ".org". The ccTLDs, however, carry national attributes; for example, ".CN" stands for "China" and ".UK" stands for "United Kingdom" [8]. Take "apple.com.cn" as an example: this domain contains a root domain, which is the omitted "." after "cn"; "cn," which is the top-level domain; "com," which is the second-level domain; and "apple," which is the third-level domain [8].
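The right-to-left reading of a domain name described above can be illustrated with a minimal Python sketch. The function name `split_domain_levels` and the level labels are our own illustrative choices, not part of any DNS library.

```python
def split_domain_levels(domain: str) -> dict:
    """Label each part of a domain name by its level, reading right to left."""
    # Drop the (usually omitted) trailing root dot and normalize the case.
    labels = domain.rstrip(".").lower().split(".")
    names = {1: "top-level", 2: "second-level", 3: "third-level"}
    levels = {"root": "."}
    # The rightmost label is the top-level domain, the next one to the left
    # is the second-level domain, and so on.
    for depth, label in enumerate(reversed(labels), start=1):
        levels[names.get(depth, f"level-{depth}")] = label
    return levels


print(split_domain_levels("apple.com.cn"))
# {'root': '.', 'top-level': 'cn', 'second-level': 'com', 'third-level': 'apple'}
```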

DNS Resolution.
DNS resolution is performed by DNS resolution servers. Such a server is a database that stores the mapping between domain names and IP addresses. It converts between the two, and it is the gateway through which users access the Internet. The resolution process can be abstracted as shown in Figure 3.
The root servers store the records of top-level domains and provide global top-level resolution. The top-level servers store the records of the second-level domains registered under each top-level domain. Likewise, the second-level servers store the records of subsequent-level domains. The recursive servers store the cached data collected during the recursive process, and they return the final resolution results to the user (note: a "record" here means the mapping between a domain name and an IP address).

DNS Resolution Process.
The end user enters the domain name of a website into the browser, for example, https://example.com. The browser sends a query to the recursive server, and the recursive server forwards this query first to the root server, then to the top-level server, and finally to the second- or third-level server to look up the records (Figure 4) [9].
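The query order described above can be mimicked with a toy, in-memory model. This is only a sketch under stated assumptions: the three dictionaries, the server names, and the address 192.0.2.10 (taken from a range reserved for documentation) are hypothetical, and no real network query is made.

```python
# Hypothetical zone data: each tier only knows where to find the next tier.
ROOT_SERVER = {"com": "tld-com"}
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}
AUTH_SERVERS = {"ns-example": {"example.com": "192.0.2.10"}}


class RecursiveResolver:
    """Toy recursive server that walks root -> top-level -> second-level."""

    def __init__(self):
        self.cache = {}   # results collected by earlier lookups
        self.trace = []   # which tier answered each step

    def resolve(self, name: str) -> str:
        name = name.rstrip(".").lower()
        if name in self.cache:
            self.trace.append("cache")
            return self.cache[name]
        labels = name.split(".")
        # Step 1: the root server points to the top-level server.
        self.trace.append("root")
        tld_server = ROOT_SERVER[labels[-1]]
        # Step 2: the top-level server points to the second-level server.
        self.trace.append("top-level")
        auth_server = TLD_SERVERS[tld_server][".".join(labels[-2:])]
        # Step 3: the second-level server holds the actual record.
        self.trace.append("second-level")
        address = AUTH_SERVERS[auth_server][name]
        self.cache[name] = address
        return address


resolver = RecursiveResolver()
print(resolver.resolve("example.com"), resolver.trace)
```

A second call to `resolver.resolve("example.com")` is answered from the cache, which reflects the caching role of the recursive server.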

Object Identifier (OID).
The Object Identifier (OID) is an identification system promoted by ISO/IEC and ITU. The purpose of OID is to give every digital object a globally unambiguous persistent name. An object identifier is organized as a tree of nodes, where each node is simply a sequence of digits [10] (Figure 5).
The OID tree is divided at its root into three different branches (also known as "arcs"): the ITU (0) arc, the ISO (1) arc, and the joint ISO and ITU (2) arc. Some designated nodes, called registration authorities (RAs), are responsible for assigning and registering sub-RAs. More specifically, there are three RAs at the top level; each of them has the right to assign new sub-RAs at the second level, each sub-RA in turn has the right to assign new sub-RAs at the third level, and so on. The RAs can therefore be of various kinds, such as national government departments, industry associations, and standardization organizations, and, depending on the RA, OIDs can be applied in different industries, such as health care, food traceability, finance, education, and digital credentials. In other words, each number in an OID stands for an RA or a specific sort of object.
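As an illustration of this hierarchy, the sketch below splits a dotted OID into its arcs and checks whether one OID lies under another. The helper names `parse_oid` and `is_under` are hypothetical, and the sample values are used purely as strings; nothing is looked up in a real OID registry.

```python
# The three top-level arcs named in the text.
TOP_LEVEL_ARCS = {0: "ITU", 1: "ISO", 2: "joint ISO and ITU"}


def parse_oid(oid: str) -> list[int]:
    """Split a dotted OID string into its numeric arcs."""
    arcs = [int(part) for part in oid.split(".")]
    if arcs[0] not in TOP_LEVEL_ARCS:
        raise ValueError(f"unknown top-level arc: {arcs[0]}")
    return arcs


def is_under(child: str, parent: str) -> bool:
    """Return True if `child` is a strict descendant of `parent` in the tree."""
    child_arcs, parent_arcs = parse_oid(child), parse_oid(parent)
    return (len(child_arcs) > len(parent_arcs)
            and child_arcs[:len(parent_arcs)] == parent_arcs)


arcs = parse_oid("1.2.156")
print(TOP_LEVEL_ARCS[arcs[0]], arcs)       # ISO [1, 2, 156]
print(is_under("1.2.156.1000", "1.2.156")) # True
print(is_under("1.2.36", "1.2.156"))       # False
```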

Handle System.
The Handle System is a distributed information system designed to provide an efficient, extensible, and secure global name service for use on networks such as the Internet [5]. In the Handle System, a "handle," that is, a unique persistent digital identifier, is used to identify a digital object on the Internet. The invention of the handle not only enabled identification management of digital objects on the Internet but also meets the need for identification services in the development of the Internet of Things. A brief illustration of the Handle System is given in Figure 6.
Mobile Information Systems

A handle consists of a prefix and a suffix, separated by a slash ("/"), representing a hierarchy of naming authorities. The prefix is issued by the DONA Foundation and is administered by a multi-primary administrator (MPA); there are 10 MPAs in total around the world [11,12]. Information about the MPAs is stored in the global handle registry (GHR), which also stores information about the different prefixes. In short, the GHR is a database of MPAs and the prefixes they hold. Under the prefix is the subprefix. The subprefix is issued by an MPA, and every qualified entity (either an organization or an individual) can apply to become a local handle service (LHS) provider with the permission of an MPA. An LHS provider can allocate complete handle identifiers (both prefix and suffix) for its own purposes. Take the Digital Object Identifier (DOI) as an example. DOIs are based on the Handle System and are widely used in academia, for example for journal articles, research reports, and official publications. "doi: 10.1000/182" is a handle that stands for the DOI Handbook. The prefix is "10.1000," and the suffix is "182." The "10" in the prefix distinguishes this handle from other handles, and the "1000" indicates the registrant. The "182" identifies the digital object, which in this case is the DOI Handbook [13].
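The prefix/suffix structure of the DOI example can be made concrete with a short sketch. The `Handle` type and the `parse_handle` function are hypothetical helpers written for illustration; they only split the string and do not contact any handle resolver.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    prefix: str  # naming authority part, e.g. "10.1000"
    suffix: str  # local name of the digital object, e.g. "182"


def parse_handle(text: str) -> Handle:
    """Split a handle (optionally written with a 'doi:' label) at the first slash."""
    text = text.strip()
    if text.lower().startswith("doi:"):
        text = text[4:].strip()
    prefix, separator, suffix = text.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not a valid handle: {text!r}")
    return Handle(prefix, suffix)


handle = parse_handle("doi: 10.1000/182")
print(handle.prefix, handle.suffix)  # 10.1000 182
print(handle.prefix.split("."))      # ['10', '1000']
```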

Decentralized Identity.
The concept of decentralized identity emerged with the rise of blockchain and distributed ledger technologies, and it is also known as self-sovereign identity (SSI). It describes a digital world in which users take greater control over their digital identities, instead of those identities being composed of accounts or identifiers borrowed from providers. This makes the Internet not only a more reliable tool but also a more robust platform for creating fruitful digital experiences. The key elements of decentralized identity are the decentralized identifier (DID) and verifiable credentials (VCs) [14,15].

The Decentralized Identifier (DID).
The DID is a new type of globally unique digital identifier associated with a subject and a DID document [14]. The subject, normally called the "DID subject," is the entity that the DID identifies. The DID subject can be anything, such as a person, an organization, a thing, or a piece of digital information, so it satisfies the requirements of the Industrial Internet and the IoT. The DID document is a set of data describing the DID subject, including the DIDs, the public keys, and the service endpoints relevant to the subject. The DID points to the DID document, and the DID document contains the information about the DID subject (Figure 7) [14].
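To make the relationship between a DID and its DID document more tangible, the following sketch builds a simplified document as a Python dictionary. It is an illustration under assumptions, not a conformant implementation: the field names follow our reading of the W3C DID specification [14], several required details (such as type fields) are omitted, and the `did:example` identifier, the key string, and the endpoint URL are placeholders.

```python
def parse_did(did: str) -> tuple[str, str]:
    """Split 'did:<method>:<method-specific-id>' into (method, identifier)."""
    scheme, method, identifier = did.split(":", 2)
    if scheme != "did" or not method or not identifier:
        raise ValueError(f"not a DID: {did!r}")
    return method, identifier


def make_did_document(did: str, public_key: str, endpoint: str) -> dict:
    """Build a simplified DID document: the DID, one public key, one service."""
    parse_did(did)  # reject malformed identifiers early
    return {
        "id": did,  # the DID that points to this document
        "verificationMethod": [{
            "id": f"{did}#key-1",
            "controller": did,
            "publicKeyMultibase": public_key,  # placeholder value, not a real key
        }],
        "service": [{
            "id": f"{did}#service-1",
            "serviceEndpoint": endpoint,  # hypothetical endpoint
        }],
    }


did = "did:example:123456789abcdefghi"
document = make_did_document(did, "zPLACEHOLDERKEY", "https://example.com/endpoint")
print(parse_did(did))
print(document["verificationMethod"][0]["id"])
```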
Unlike the other digital identity systems, which have distinct hierarchical frameworks, the DID builds a direct relationship between subjects and blockchains. Initially, anyone or anything with the proper software can generate a DID and begin using it immediately, without requiring the authorization or involvement of any centralized registration authority. This is the same process used to create public addresses on Bitcoin, Ethereum, and other popular blockchains. Meanwhile, the DID document records how the DID was created and who its sole controller is. Finally, both the DID and the DID document are stored on the blockchain, which makes the DID self-controlled and decentralized (Figures 8 and 9).
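The idea that a DID can be generated locally, with no registration authority involved, can be sketched as follows. This is a deliberately simplified stand-in: a real DID method derives the identifier from an actual public/private key pair, whereas here random bytes and SHA-256 digests merely play those roles, and the function name `generate_did` and the `example` method name are assumptions for illustration.

```python
import hashlib
import secrets


def generate_did(method: str = "example") -> tuple[str, bytes]:
    """Create a DID-like string locally, without asking any authority."""
    # Stand-in for a private key: 32 bytes from a cryptographic random source.
    private_key = secrets.token_bytes(32)
    # Stand-in for the matching public key (NOT real public-key cryptography).
    public_key = hashlib.sha256(private_key).digest()
    # The identifier is derived from the public part, so its holder can be
    # linked to the key material without any central registry.
    identifier = hashlib.sha256(public_key).hexdigest()[:32]
    return f"did:{method}:{identifier}", private_key


did, private_key = generate_did()
print(did)
```

Because the identifier comes from locally generated random material, two independent calls are overwhelmingly unlikely to collide, which is what lets every party mint identifiers without coordination.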
Basically, the DID is a string that is randomly generated by cryptographic algorithms and software and does not depend on issuance or authorization by any authority. The DID is very similar to a Bitcoin address, but it has additional properties, as follows [16]:

authorities, the kind of system needed for almost every other global identifier system we use.

The DID and the DID document can be stored on a blockchain, which serves as the trust anchor and thereby realizes the decentralized nature of this new type of identifier.

Verifiable Credentials (VC).
Verifiable credentials (VCs) are one of the most important elements of SSI and are the manifestation of the DID. The W3C Verifiable Credentials Data Model v1.1 notes that "Credentials are a part of our daily lives; driver's licenses are used to assert that we are capable of operating a motor vehicle, university degrees can be used to assert our level of education, and government-issued passports enable us to travel between countries" [17]. Verifiable credentials are digital credentials that are closely related to the DID and provide authentication for a decentralized identity. A verifiable credential contains three main components [17,18]:
(i) Metadata: issued with the issuer's cryptographic signature, the metadata "describe attributes of the credential, such as the issuer, the expiration date and time, a representative image, a public key to use for verification purposes, the revocation method, and so on" [17].
(ii) Claims: a claim is a declaration made about a topic, for instance, the statement that "Alice's date of birth is January 1, 1990."
(iii) Proofs: a proof is data about yourself that enables other people to verify the source of the data, to check that the data belongs to you and has not been tampered with, and, finally, to confirm that the data has not been revoked by the issuer. A proof is also known as an identity document or an identity credential.

identity, itinerary, and COVID-19 health credentials to over 10,000 airline passengers per month [25].
(i) After three years of work, the World Wide Web Consortium (W3C) Decentralized Identifiers (DID) 1.0 specification was finally approved by the W3C DID Working Group and is now an official Recommendation [14]. Meanwhile, the W3C Verifiable Credentials Working Group is finalizing the charter for the second generation of its Verifiable Credentials specification [26].
(ii) The Trust Over IP (ToIP) Foundation has published its first official specifications for decentralized governance frameworks [27].

Influence.
From a technical perspective, DID is a technology that integrates existing cryptography, distributed storage, and various Internet protocols, TCP/IP for example, and it does not involve many technical difficulties. However, SSI is trying to build, on top of the existing Internet, an ecosystem that eliminates monopoly and ensures independent data control with trusted interaction. From a practical perspective, SSI reshapes the way people access Internet services. Today, users who want to access services on the Internet must first provide personal data to the provider to register an account; only after obtaining the account can they access the provider's services. For example, we need to log in with a Google account before using Google services. With DID, users can choose for themselves whether to use a DID to access Internet services, instead of entrusting suppliers to create accounts for them. That is, DID hands ownership of digital identities such as accounts over to the users themselves. It is the "key" in the hands of the user, which can open the door to various Internet services without the need for a supplier to manage the "key" on the user's behalf. In addition, enabling technologies such as IoT, blockchain, and digital twins can collectively contribute to creating more powerful SSI solutions with various advantages. In the IoT field, the benefits can be summarized as follows [28]:
(i) Decentralized data processing. With the help of SSI solutions, IoT devices can process and share data directly without relying on a central server, which can reduce latency and enhance the scalability of the system.
(ii) Redundancy and resilience. An SSI solution ensures that if one IoT node fails, the network can still function, enhancing the overall reliability and resilience of the IoT ecosystem.
(iii) Data privacy and security. With decentralized data storage and processing, IoT devices can contribute to better data privacy. Personal information can be stored locally on devices, reducing the risk of a single point of failure and of potential data breaches.
Blockchain can benefit SSI solutions in the following ways [28]:
(i) Decentralized trust. Blockchain eliminates the need for a central authority by providing a decentralized ledger that is transparent, secure, and tamper-resistant. This ensures trust among SSI participants without the need for intermediaries.
(ii) Smart contracts. Automated, self-executing smart contracts on the blockchain facilitate decentralized agreements and transactions. This reduces the need for intermediaries and increases the speed of generating the DID methods.
(iii) Immutable record keeping. The decentralized and distributed nature of blockchain ensures that once data is recorded, it cannot be altered or deleted. This feature enhances the integrity of data, which is crucial for SSI applications such as supply chain management and healthcare.
In the digital twin field, the benefits are as follows [29]:
(i) Decentralized simulation. Digital twins can benefit from decentralized simulation environments. This allows for more realistic and accurate modeling by distributing the computational load across various nodes.
(ii) Collaborative decision-making. Decentralized digital twins enable collaborative decision-making by allowing multiple stakeholders to access and contribute to the digital representation of a physical entity. This can be valuable in industries such as manufacturing and urban planning.
(iii) Real-time monitoring and control. The decentralized nature of digital twins allows for real-time monitoring and control of physical entities. Changes and updates in one part of the system can be reflected across the entire decentralized network, ensuring synchronization.
Moreover, DID is the most important basic resource in the Web3.0 concept. The EU believes that "Web3.0 is a new decentralized network model where users can own and control their data, while DID provides a decentralized authentication method that allows everyone to control their digital identity information" [30]. At present, various applications of Web3, such as decentralized finance (DeFi), the Metaverse, and decentralized applications, all regard DID as an important means of implementation. Therefore, DID and Web3.0 are closely connected: one provides decentralized identity management, and the other provides a framework for user participation and data ownership. Together, the two form an important component of Web3.0 [31].

Secondly, after years of improvement, DNS is a relatively secure system for users. For example, DNSSEC adds two important features to DNS: data origin authentication and data integrity protection. These two features are used to verify that the response to a request for a DNS record comes from its authoritative name server and was not spoofed or manipulated along the way [32,33]. However, because registry control of DNS lies with ICANN [34], which means that no other organization is able to control it, it is undeniable that the DNS is a highly centralized system, with risks of server breakdown and man-made damage.
3.1.2. OID.
The OID system provides digital identities with globally unambiguous, persistent digital identifiers and distributed management at each layer. Firstly, OIDs identify and locate objects of various types, so that all kinds of objects, sorted by industry and application scenario, can be connected to the Internet, further integrating the physical and digital worlds. Secondly, OIDs are strictly managed by RAs according to their purpose of use, so the entire system is well standardized, which facilitates the registration and application process. However, it is also clear that the OID system is complex. For example, "1.2.156" and "1.2.36" are two arcs, and the digits that follow "156" and "36" can be quite different [10]. Even identifiers under the same arc can differ, such as "1.2.156.1000" and "1.2.156.10000," and without professional knowledge it is impossible to tell them apart. Moreover, a long arc implies that locating the digital object can be time-consuming. When dealing with larger amounts of data, resolution has to proceed from the top of the arc to the bottom, and the computing power is limited by the RAs, which may vary from one another.
3.1.3. Handle.
The Handle System has been deployed for more than 20 years, and billions of identifiers have been registered. Its early applications were mainly focused on digital-content-related fields, but with the deepening of industrial digital transformation and the development of the Industrial Internet, the Handle System has expanded into manufacturing fields such as railways and construction.
However, the Handle System is run by DONA and the MPAs, so a centralized authority still exists. Moreover, the MPAs are run by local companies or organizations, which means that handles are not cheap to register, and this may be one of the reasons why the system is hard to apply in IoT scenarios, where the number of things is enormous [35].
3.1.4. DID.
Fundamentally, instead of identities being composed of accounts or identifiers that are "borrowed" from providers, DID gives the controller the right to possess digital identities. As DIDs are more broadly adopted across the web, they give rise to a more resilient Internet, where digital identity is not borrowed from a provider, the way domain names and social media accounts are, but rather controlled by a controller, and thus forms the basis of a new kind of verifiable digital trust. This makes the Internet not only a more reliable tool but also a more robust platform for creating richer digital experiences [36]. DID has become a hot topic worldwide, just like the concept of blockchain, and there are remarkable implementations and pilots in different industrial areas, but whether it will bring about social and economic revolutions remains to be seen.

Comparison.
We compare these four systems from the perspectives of "readability," "security," "authority," "hierarchy," and "expansibility." Firstly, "readability" means whether the identifier can be easily recognized by humans. Domain names can be easily memorized by people, while the others are machine-readable only. The complex coding schemes of OID, Handle, and DID are beneficial to Industrial Internet use cases, since larger naming spaces provide more room for a huge number of industrial objects. Secondly, "security" ensures that the digital identity cannot be easily stolen or tampered with. DIDs are generated by cryptographic algorithms rather than by man-made rules or methods, which makes them harder to crack. Thirdly, "authority" means whether the system is well adopted by society and widely used in economic production. Since DNS has been operating for over 40 years, it is the most mature system compared with the others, especially DID, which is a new concept. Fourthly, "hierarchy" means whether the system is root-based or centralized. DNS and OID have root servers embedded in their system architecture, as can be seen from their structure. Handle weakens the notion of a centralized root and instead entrusts different parties (the MPAs) with operating separate roots. Unlike them, decentralized identity is a fully decentralized system, with no relying parties controlling the digital identifier and no root server at all. Finally, "expansibility" means whether the digital identity can be adapted to different systems. More specifically, DNS is mainly used on the Internet, and OID and Handle are mainly used in electronic medical records and literature publication. DID, however, does not rely on heavyweight systems, so it can be applied in any scenario (Table 1).

It is estimated that the potential of the decentralized identity market is $0.55 trillion. Although decentralized identity is still in its infancy compared with the others, together with the concepts of Web3, the Metaverse, and NFTs, we believe that the "decentralized" idea will continue to incubate and that decentralized identity will definitely have a great impact on the transformation of today's Internet and Industrial Internet [37].

Figure 1: Some of the milestones in the Internet digital identity evolution path.

Figure 2: An illustration of the variable-depth tree structure of the domain name (example names shown: apple.com.cn, sina.com.cn).

[Figure: DNS resolution among the user, the recursive server, the root server, the top-level server, and the second-level server, linked by recursive queries.]
(ii) Resolvable. Since the DID is paired with a DID document, which contains the metadata of the subject, anyone can look it up to discover the metadata of the DID.
(iii) Verifiable. A DID is associated with a public/private key pair, and the controller of the private key can prove that they are the sole owner of the DID. Conversely, anyone can verify the DID to ensure that it belongs to the real controller.
(iv) Decentralized. The cryptographic mechanism eliminates the need for centralized registration

Table 1: Comparison between four systems.
The above paragraphs list the purpose, solution, structure, and movement of the four types of digital identity systems: DNS, OID, the Handle System, and DID. Here we give personal viewpoints on these systems.
3.3. Future Directions.
Strategic technologies that can play a transformative role usually go through a process from conception to technology testing, and from technology pilot to large-scale application, and each generation probably takes 10 years or more to develop. We have to admit that all of these systems play pivotal roles in today's Internet, Industrial Internet, and IoT activities. However, as more and more privacy and security issues arise, the earlier systems are undergoing regeneration and iteration. The DID depicts a new user-controlled digital world built on new technologies and concepts.