A repository stores data so that it can be retrieved and modified later. Different types of repositories exist, and they are used for version control, metadata management and other purposes.

What is a repository?

“Repository” means “storage” and comes from the Latin word repositorium. In software technology, a repository is a digital archive in which data, documents, development progress, metadata and programs can be stored and shared. Many repositories also provide version control. Depending on the intended use, this technology enables large teams or communities working all over the world to collaborate on a shared project. The available types of repositories differ in their approach and structure. The best-known examples include the hosting platform GitHub and the Google Repository.

The basis for a repository is usually a database, which, depending on requirements, can be set up on a local hard disk or a server, or can also be distributed across numerous servers in a content delivery network (CDN). Data catalogs are created that contain the forms and representations of various stored objects and provide information about their relationship to each other. All this information is stored in the form of metadata and can be searched for, retrieved, modified and adapted at any time with the appropriate authorization.

How is a repository structured?

To illustrate how a repository is structured, let’s visualize a tree. In software development, you can even see this reflected in the terminology. Here a distinction is made between the trunk, which contains the current version of a project and its source code, and the branches, where edits are made in isolation. Changes are later merged back into the trunk so that all participants have access to them. Tags mark specific states of the project, such as a release, so that they can be retrieved later.
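The tree structure can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions: the class `Repository` and its methods are hypothetical names chosen for illustration, changes are plain strings, and merge conflicts are ignored entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Repository:
    """Toy model (hypothetical): a trunk, named branches and named tags."""
    trunk: list = field(default_factory=list)      # changes on the main line
    branches: dict = field(default_factory=dict)   # branch name -> changes made on it
    tags: dict = field(default_factory=dict)       # tag name -> frozen trunk snapshot

    def create_branch(self, name: str) -> None:
        # A branch collects edits separately from the trunk.
        self.branches[name] = []

    def commit(self, change: str, branch: Optional[str] = None) -> None:
        # Record a change on the trunk, or on a branch if one is named.
        target = self.trunk if branch is None else self.branches[branch]
        target.append(change)

    def merge(self, name: str) -> None:
        # Add the branch's changes back to the trunk and close the branch.
        self.trunk.extend(self.branches.pop(name))

    def tag(self, name: str) -> None:
        # A tag freezes the current state of the trunk under a name.
        self.tags[name] = tuple(self.trunk)


repo = Repository()
repo.commit("initial import")
repo.tag("v1.0")
repo.create_branch("feature")
repo.commit("add login form", branch="feature")
repo.merge("feature")
```

After the merge, the trunk contains both changes, while the tag `v1.0` still points to the state before the branch was merged.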

What types of repositories are there?

Not all repositories are the same. They differ in what they archive and in how they archive it. The following are the best-known types.

Repository for version management

In version management, the aim is to store data clearly in a common archive while recording the individual work steps and how they relate to each other. Source code files and other data are stored and archived. Data can be copied from the repository to a local hard drive so that developers can continue working with it. This process is referred to as “checking out”. The developer then works with the local data, making changes or discarding previous changes. Once the work is complete, the latest state of the project is uploaded back to the repository, which is referred to as “checking in”. All changes and comments are logged during this process.

This approach has several advantages. For one, users can collaborate on a project without overwriting older versions. Instead, every change of state is logged, making it possible to return to a previous version. A repository enables small and large teams to collaborate on the same project. Updates can be made simultaneously without states being overwritten or changes being lost. In principle, any user can pick up the project from any state without putting earlier work at risk.
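The check-out and check-in cycle, and the ability to return to an earlier version, can be sketched as follows. This is a minimal in-memory illustration, not the API of any real version control system: the class `VersionedStore`, its method names and the revision labels (`r0`, `r1`, ...) are assumptions made for this example.

```python
class VersionedStore:
    """Toy version store (hypothetical): every check-in adds a new revision."""

    def __init__(self, files):
        self._revisions = [dict(files)]   # revision 0 is the initial import
        self.log = ["r0: initial import"]

    def check_out(self, revision=-1):
        # Hand out a private copy so local edits never touch stored revisions.
        return dict(self._revisions[revision])

    def check_in(self, working_copy, comment):
        # Store the working copy as a new revision and log the comment.
        self._revisions.append(dict(working_copy))
        number = len(self._revisions) - 1
        self.log.append(f"r{number}: {comment}")
        return number


store = VersionedStore({"README": "hello"})
working_copy = store.check_out()            # "checking out"
working_copy["README"] = "hello world"      # local change
store.check_in(working_copy, "extend greeting")  # "checking in"
old_state = store.check_out(0)              # an older version is still available
```

Because each check-in is appended rather than written over the previous state, revision 0 still holds the original file after the change has been checked in.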

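The most popular version control systems include CVS, Git and SVN (Subversion). GitHub, by contrast, is a hosting platform built on Git rather than a version control system in its own right.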

Repository for metadata

A repository for metadata tends to be used in highly complex IT infrastructures. Such a repository contains the data of the entire system as well as information about the infrastructure’s context and environment. The advantage of this type of repository is that changes can be made without altering the source code or needing to implement additional programs. Instead, the database table on which the respective system is based is adapted in a straightforward manner. The metadata repository tends to be used in enterprise application integration (EAI) and data warehousing.
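The idea of changing behavior by adapting a table rather than the source code can be illustrated with a small sketch. Everything here is hypothetical: the `METADATA` table, its column names and the `validate` function are invented for illustration, and a real metadata repository would keep this information in a database rather than in a Python dictionary.

```python
# Hypothetical metadata table: one row per field of a "customer" record.
METADATA = {
    "customer": [
        {"field": "name", "type": str, "required": True},
        {"field": "age", "type": int, "required": False},
    ]
}


def validate(entity, record, metadata=METADATA):
    """Check a record against the metadata rows stored for its entity."""
    errors = []
    for column in metadata[entity]:
        name = column["field"]
        if name not in record:
            if column["required"]:
                errors.append(f"missing field: {name}")
            continue
        if not isinstance(record[name], column["type"]):
            errors.append(f"wrong type for field: {name}")
    return errors


before = validate("customer", {"name": "Ada"})
# Adapting the table changes the rules; validate() itself stays untouched.
METADATA["customer"].append({"field": "email", "type": str, "required": True})
after = validate("customer", {"name": "Ada"})
```

The same record passes before the table is adapted and is reported as incomplete afterwards, although no line of the validation code has changed.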

Repository for software

A software repository is particularly important for Linux users. It contains application packages and the corresponding metadata such as descriptions, annotations, dependencies and changes. Installation and updates are performed using a package manager. In this way, users don’t have to track updates for each application individually. Instead, the package manager updates the system from the repository. The updates themselves are often provided by the community. Users who look after packages, known as package maintainers, typically provide the updated data and carry out the maintenance of the respective software repository.
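How a package manager might use the dependency metadata in a software repository can be sketched as follows. The package names, versions and the `INDEX` structure are invented for this example, and the depth-first ordering shown here is only one simple way to resolve dependencies; real package managers also handle version constraints and conflicts.

```python
# Hypothetical repository index: package name -> version and dependencies.
INDEX = {
    "editor": {"version": "2.1", "depends": ["libui", "libtext"]},
    "libui": {"version": "1.4", "depends": ["libcore"]},
    "libtext": {"version": "0.9", "depends": ["libcore"]},
    "libcore": {"version": "3.0", "depends": []},
}


def install_order(package, index=INDEX):
    """Return the packages to install, with dependencies listed first."""
    order = []
    visiting = set()

    def visit(name):
        if name in order:
            return  # already scheduled for installation
        if name in visiting:
            raise ValueError(f"dependency cycle at {name}")
        visiting.add(name)
        for dependency in index[name]["depends"]:
            visit(dependency)
        visiting.discard(name)
        order.append(name)

    visit(package)
    return order


plan = install_order("editor")
```

For the sample index, `plan` lists `libcore` first and `editor` last, so each package is installed only after everything it depends on.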

Repository for document servers

The term repository is also applied to extensive network publications and document servers, at least figuratively. Although some special features of the repository principle aren’t adopted one to one, the procedure is adapted for use. Well-known document servers such as arXiv publish papers from fields including biology, computer science, mathematics, physics and statistics. Moderators with subject expertise check new submissions and approve or reject them. The scientific works are then made available for download. However, in contrast to a version control repository, readers cannot edit the documents.

Repository for CASE

A repository is also frequently used in computer-aided software engineering (CASE). There it mainly serves to store project data, documentation and source code.

Which repositories are useful?

Numerous types of repositories are available for different purposes. A distinction is made between solutions that are open source and those offered commercially. The most popular platform for hosting open-source projects is GitHub, although the platform itself is proprietary. There are various GitHub alternatives such as Apache Allura, Bazaar, Gitolite, Mercurial or SourceForge. A detailed comparison of GitHub and GitLab is available in our Digital Guide. Among the best-known commercial offerings are Alienbrain, IBM Rational Synergy and MySQL Yum. BitKeeper, long a proprietary product, has been open source since 2016.

Whether a repository is suitable for your project depends on your requirements and your way of working. For teamwork, a repository can improve work processes and optimize workflow. Even if employees access a project and make changes at different times and from different locations, the trunk remains protected. Solutions can be tested without jeopardizing previous progress. It’s a good idea to test an open-source solution before purchasing a commercial option.

How does a repository work?

Used correctly, a repository offers several advantages. GitHub is a great example of this. Once you’ve installed Git and set up a GitHub account, you can use the intuitive web interface to assign and process tasks. Changes are recorded as commits and proposed to the project through pull requests. In this way, a team leader can track individual progress steps and members can follow the project down to the smallest detail. To learn more, have a look at our Git tutorial.
