‘The Sum Total of All Human Knowledge’, Part IV

Refining the schema

In the previous post in this series, we had arrived at the possibility of utilising a rigorous alphanumerical schema for indexing human knowledge based upon the temporal sequencing of the time-line of Cosmic Evolution and the through-line of Big History. This sequence was originally considered (Part I) as the most natural way to index knowledge disciplines, as it is both intuitively powerful (Part II), and based on the quasi-objective observable parameter of rising complexity over the course of cosmic time. Here we shall start to flesh out and fill in that indexing with an actual numerical scheme, based upon the final choice (Part III) of a combination of the Outline of Knowledge (OoK) from Encyclopedia Britannica (Adler 1994), and the Universal Decimal Classification (UDC) (UDC Consortium 2022).

The OoK+UDC schema, 1st approximation

In this post, the overall bulk structure of the schema will be described. This will include the numbering system from OoK for localising the knowledge discipline areas, as well as certain ‘auxiliary’ elements from the UDC that are concerned with indicating cross-references or relationships between related discipline areas. The structure of a zettel ID (ZID) will be defined and examined, and some final observations about the expandability of the OoK will set the scene for the next post which will extend the schema into its final, working form.

The OoK numbering

We chose in the previous post to use the number ‘0’ (zero) to represent what the OoK numbered as Part ‘10’ (ten). This is a simplification for the purposes of notation, certainly, but it also has a conceptual importance arising from the ‘hub’ notion also mentioned in that post. This will become clearer in the next post. For now, let us look at how the OoK schema numbers its knowledge areas.

The general numbering pattern of the OoK is best illustrated with an example, such as the one used in the opening section of the Propædia: ‘How to use the Propaedia’, namely the sectioning ID for the topic of Smelting (of ores in metallurgy). This example was likely chosen because it shows all 7 levels of sectioning in the OoK framework.

Technology is Part 7. There are three Divisions of Technology. These are: I. The Nature and Development of Technology; II. Elements of Technology; and III. Major Fields of Technology.

Under Division II, there are 5 Sections, the fifth of which is 5. Technology of Industrial Production Processes. Within each Section there may be further subdivisions, the topmost of which is rendered with capital letters. (These are not visible on the Wikipedia page, as the description there only goes down to the Section level, and no further. It is necessary to use the Propædia itself, whether in hardcopy or electronic form, to see these further subdivisions.) So, the four sub-sections of 5. Technology of Industrial Production Processes are labelled A, B, C and D. We are interested in B. Metallurgy. This sub-section in turn has two sub-sub-sections: 1. Mineral processing and 2. Extractive metallurgy, the latter of which contains the target discipline area. Extractive metallurgy itself has three further subdivisions: a. Pyrometallurgy; b. Electrometallurgy; and c. Hydrometallurgy. Our target is Pyrometallurgy, which itself has a further four subdivisions: i. Roasting; ii. Smelting; iii. Converting; and iv. Refining.

This looks like:

7. Technology
  II. Elements of Technology
    5. Technology of Industrial Production Processes
      B. Metallurgy
        2. Extractive metallurgy
          a. Pyrometallurgy
            ii. Smelting

The Section numbers in the OoK combine three levels: the Part (top level, first digit); the Division (next level, rendered in capital Roman numerals when referred to on its own, although the Wikipedia page renders it with Arabic numerals); and the Section number within the Division. These are expressed as three digits in the order Part-Division-Section. Thus, Section 5 of Division II of Part 7 is rendered as ‘725’. We can make the same numeral substitution (i.e., Roman to Arabic) for the bottom-most level as well, so that ‘ii’ becomes ‘2’.

Therefore, combining the Section number with all the sub-section identifiers, Smelting ends up with the sectioning ID of 725.B.2.a.2. But most of those decimal points are redundant, because we already know the way the numbering system works and in what order which letters and numbers can occur next to each other. For the sake of brevity of notation, therefore, most of the decimal points can be removed without introducing undue ambiguity. Hence this can be written as 725B2a2.
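The composition rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `roman_to_int` and `section_id` are hypothetical, and the sketch assumes that it is always the fourth subdivision level (the ‘ii’ in the Smelting example) that carries lower-case Roman numerals. It also applies the ‘0’-for-Part-10 convention adopted in the previous post.

```python
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10}


def roman_to_int(numeral: str) -> int:
    """Convert a small Roman numeral (either case) to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.lower()]
    total = 0
    for current, following in zip(values, values[1:] + [0]):
        # A smaller value written before a larger one is subtracted ('iv' = 4).
        total += -current if current < following else current
    return total


def section_id(part, division, section, *subdivisions, dotted=False):
    """Build an OoK sectioning ID such as '725B2a2' (hypothetical helper)."""
    part_digit = 0 if part == 10 else part  # Part 10 is written as '0'
    head = f"{part_digit}{roman_to_int(division)}{section}"
    tail = []
    for level, label in enumerate(subdivisions):
        # Assumption: only the fourth subdivision level uses Roman numerals.
        tail.append(str(roman_to_int(label)) if level == 3 else str(label))
    return ("." if dotted else "").join([head] + tail)


print(section_id(7, "II", 5, "B", 2, "a", "ii"))               # 725B2a2
print(section_id(7, "II", 5, "B", 2, "a", "ii", dotted=True))  # 725.B.2.a.2
```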

There are only two instances where the 3rd-level Section number exceeds 9 (both of which are in 96 History: The Modern World). But because whatever (if anything) follows such a two-digit number will always be a letter, there should in principle be no confusion introduced, although sufficient ‘.’ could be used to remove any potential for ambiguity. Thus, Part 9. History, Division VI. The Modern World, Section 10. China until Revolution 1839-1911, Japan from Meiji Restoration to 1910 could be rendered as 9610, or equivalently as 96.10, with the sub-sections A and B dealing separately with China and Japan respectively having their further subdivisions. Thus, the topic A. China under the late Ch’ing: the challenges of rebellion and Western penetration is 9610A or 96.10A while B. Modernization of Japan and emergence as a world power (1868-c.1910) is equivalently 9610B or 96.10B. The other such instance is 96.11; in places where a similar situation obtains with more than 9 subdivisions of a numerical index character, a similar consideration applies.
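To see why the compact form stays unambiguous, here is a minimal parsing sketch. It assumes, as argued above, that the Part and the Division are always a single digit each, that the Section number runs up to the first letter, and that the remaining levels alternate between letters and numbers. The name `parse_section_id` is hypothetical.

```python
import re


def parse_section_id(sid: str) -> dict:
    """Split an ID such as '725B2a2' or '96.10A' into its levels."""
    compact = sid.replace(".", "")  # the dots are optional, so drop them
    match = re.fullmatch(r"(\d)(\d)(\d+)((?:[A-Za-z]+|\d+)*)", compact)
    if match is None:
        raise ValueError(f"not a recognisable section ID: {sid!r}")
    part, division, section, rest = match.groups()
    return {
        "part": int(part),
        "division": int(division),
        # The Section takes every digit up to the first letter, so '10' works.
        "section": int(section),
        "subdivisions": re.findall(r"[A-Za-z]+|\d+", rest),
    }


print(parse_section_id("725B2a2"))
print(parse_section_id("96.10A") == parse_section_id("9610A"))  # True
```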

For the special Part 10. Branches of Knowledge we simply omit the ‘1’ from the ‘10’ to get ‘0’ (as suggested in the previous post). So, from the Wikipedia page, we can see that Applications of Mathematics is 023; History & Philosophy of Science is 031; and Historiography is 041. And so on.

In this way, all sections of the knowledge surveyed in the Propædia can be specified by using a few alphanumerical characters. The numbering is both fixed and widely known, and different researchers using this schema can easily compare notes from their respective collections. As noted above, a copy of the Propædia is needed to get down to the nitty-gritty level beyond the three levels that are visible on Wikipedia. My 1994 hardcopy cost less than US$20 from an online second-hand bookseller, while an electronic version of it can be found at the Internet Archive. My preference is to have the hardcopy, of course, since Physical ‘Zettelkasteneering’ is all about the hardcopy.¹

The UDC auxiliaries

One of the main features of the UDC is the use of the so-named ‘auxiliaries’ that may be appended to any or all of the classification numbers (or “classmarks”). Rather than being embedded into the decimal notation (as is frequently the case in DDC), these are separated out with special characters that indicate a variety of extensions to the simple class numbering.

This facility allows new classmarks to be created as needed, for subjects or topics that may not exist in the schedules, by either linking numbers together from the main classes, or by adding numbers from the auxiliary tables. For a physical ZK this brings up the question of where such a combined classification is to be filed. Good practice guidelines exist for how this is to be achieved, essentially in numerical order and moving from the general to the specific (see, e.g., Batley 2014, 111ff), although it is possible to modify this filing convention if necessary for specific cases.

There are several types of auxiliaries, listed in detailed tables, including:

Co-ordination (i.e., linking), using a plus ‘+’ symbol, which connects two or more (non-consecutive) numbers to form a new compound topic number;
Consecutive extension, using a slash ‘/’ symbol, which denotes a consecutive sequence of numbers indicated by the first and last shown, generally to indicate a broad subject area;
Simple relation, using a colon ‘:’ sign, which indicates a relationship between two or more topics, without any particular ordering (so the filing convention mentioned above is used);
Order-fixing, using the double colon ‘::’, which emphasises that the order is important and that the reverse order does not have the same meaning (in this case the filing convention is not followed);
Place, indicated by parentheses ‘()’ with numbering running 1/9 (i.e., from 1 to 9, per the ‘/’ notation for consecutive extension, above); and
Time, indicated by double-quotes “…”.

Let us see how the auxiliary notation of the UDC can be used to render the sections mentioned above, namely 9610, with its two sub-divisions 9610A and 9610B.

In the OoK, ‘History’ has many subdivisions organised by place and time, with specific subdivisions applying to continents, regions, countries, and eras, etc. This is why Part 9. History has so many Divisions and Sections as compared with most other Parts.

In UDC, by contrast, the class for ‘General history’ is located at 94 and all place and time aspects are rendered using the auxiliaries of place and time attached to this single class number (as of the 1994 revision).

From the UDC auxiliary table for place (1e), we find that China is rendered as (510), and the main islands of Japan as (520). From the auxiliary table for time (1g), we see that we can use direct year ranges rendered using the consecutive extension symbol ‘/’ enclosed in “double quotes”.

The year range for 9610A is given as 1839-1911, which is rendered as “1839/1911” in the UDC time auxiliary notation. The Meiji Restoration took place in 1868, so the time notation for 9610B will be “1868/1910”.

Thus, 9610A has the UDC classmark: 94(510)“1839/1911”, where 94 = general history; (510) = China, and the year range is as shown. Similarly, 9610B has the UDC classmark: 94(520)“1868/1910”, where (520) = ‘mainland’ Japan.

So, we can now form the equivalent of 9610 in OoK by combining these two classmarks using the co-ordination ‘+’ symbol, whence 9610 becomes 94[(510)“1839/1911”+(520)“1868/1910”] in UDC, where the square brackets are used for grouping the two separate place/time auxiliaries under the single class 94.
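The construction of these classmarks is mechanical enough to sketch in code. The following is a rough illustration only: the helper names `place_time` and `classmark` are hypothetical, plain ASCII double quotes stand in for the typographic ones used in the text, and the bracketed grouping simply mirrors the example above rather than implementing the full UDC rules.

```python
def place_time(place: int, start_year: int, end_year: int) -> str:
    """Combine a place auxiliary '(...)' with a time auxiliary '"..."'."""
    # The year range uses the consecutive-extension symbol '/'.
    return f'({place})"{start_year}/{end_year}"'


def classmark(main_class: str, *auxiliaries: str) -> str:
    """Attach one auxiliary string, or several co-ordinated with '+'."""
    if not auxiliaries:
        return main_class
    if len(auxiliaries) == 1:
        return main_class + auxiliaries[0]
    # Square brackets group the co-ordinated auxiliaries under one class.
    return main_class + "[" + "+".join(auxiliaries) + "]"


china = place_time(510, 1839, 1911)  # OoK 9610A
japan = place_time(520, 1868, 1910)  # OoK 9610B

print(classmark("94", china))         # 94(510)"1839/1911"
print(classmark("94", japan))         # 94(520)"1868/1910"
print(classmark("94", china, japan))  # 94[(510)"1839/1911"+(520)"1868/1910"]
```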

Here then is an example of both the specifying power of UDC and the virtue of the looser aggregation of OoK. There is clearly much more scope for flexibility in defining unusual or ad hoc classmarks in UDC, which can be utilised if needed in very special cases. But the OoK utilises relatively few, broader and more tractable place/period groupings, which are conceptually easier to remember. For our purposes, it is mainly the flexibility of the +, :, and :: operators that is of most interest, due to the connections that may be indicated within the zettel’s identifying number itself.

Structure of the ZID

The number of an individual zettel—what we have called a ‘zettel ID’ (‘ZID’)—can be considered as consisting of three main parts:

the alphanumerical code defining its placement into a particular section in the overall knowledge framework—the ‘section ID’, or (borrowing a term from UDC) ‘classmark’;
a separator character that indicates that the sectioning information has ended and that further numbering is now concerned with the relative placement of the zettel within that section; and
the alphanumerical code that actually distinguishes the particular zettel from, and its location and relationship with respect to, the other zettels already in that section—the ‘in-section ID’.

The separator character is specifically mentioned here as a distinct element because what that character is (or can be) may depend upon whether the implementation is a physical Zettelkasten (PZK) in a card file, or a digital one (DZK) in a computer. A common character for analogue/physical cards is the slash character ‘/’. Unfortunately, this can sometimes be a problem for computer operating systems, such as those that use this character to denote directory structures. In that case, for DZKs on these operating systems a different character must be used, say, a double-dash ‘--’, since the spacing this produces would help visually distinguish the two parts of the ZID in a filename or elsewhere. The use of the slash character ‘/’ as a separator could also be a problem if use is also made of the ‘consecutive extension’ ‘/’ character from the UDC in the OoK sectioning ID.

Schematically, the ZID can be regarded as follows:

§ | #

where ‘§’ is the section identifier or classmark (here rendered with the well-known ‘section’ character); the vertical line ‘|’ represents the separator, whatever that may actually be; and ‘#’ represents the zettel’s unique in-section location within the section in which it is filed.
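For a digital implementation, the three-part structure might be modelled along the following lines. This is a sketch under assumptions, not a prescribed format: the names `ZID`, `format_zid` and `parse_zid` are hypothetical, and the separator is passed in as a parameter precisely because it may differ between a PZK (‘/’) and a DZK (‘--’).

```python
from typing import NamedTuple


class ZID(NamedTuple):
    section: str     # the classmark (section ID), e.g. '725B2a2'
    in_section: str  # the in-section ID, e.g. '1a1'


def format_zid(zid: ZID, separator: str = "/") -> str:
    """Join the two parts with whichever separator the implementation uses."""
    return f"{zid.section}{separator}{zid.in_section}"


def parse_zid(text: str, separator: str = "/") -> ZID:
    """Split a ZID string back into its section and in-section parts."""
    # Split at the LAST separator: the in-section ID never contains one,
    # whereas a classmark may contain '/' (UDC consecutive extension).
    section, _, in_section = text.rpartition(separator)
    if not section or not in_section:
        raise ValueError(f"no separator {separator!r} found in {text!r}")
    return ZID(section, in_section)


card = ZID("725B2a2", "1a1")
print(format_zid(card))        # 725B2a2/1a1   (physical card)
print(format_zid(card, "--"))  # 725B2a2--1a1  (filename-safe variant)
print(parse_zid('94(510)"1839/1911"/1a'))
```

Splitting at the last separator is one way around the clash noted above between the ‘/’ separator and the ‘/’ of consecutive extension; choosing a different separator altogether avoids the question.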

The ZID’s section identifier/classmark § will depend on where in the overall knowledge structure of the OoK+UDC schema it most naturally fits. But the possibility also exists to highlight in the § numbering that there may be a relationship with one or more other knowledge sections elsewhere in the overall knowledge structure. The linking made possible by the UDC auxiliaries also means we can cross-link directly from the ZID itself (not just from symbols written inside the note’s text), with the filing order modified as per the UDC convention (including special ordering when using symbols). In this case, one could use the ‘+’ linking notation, as well as the ‘simple relation’ colon ‘:’ auxiliary notation if the relationship is symmetrical or balanced, or the ‘order-fixing’ double-colon ‘::’ notation if the ordering shown is important and not reversible. Other possible uses of the auxiliaries might also be considered, such as timings, places, etc., if such a degree of specificity is required for filing a zettel (but generally it won’t be).

The zettel’s unique in-section ID # will depend on the timing of when the zettel is installed into its place in the ZK. If it is the very first zettel to go into a particular section, it will have complete freedom of numbering; it will probably get the unique ID ‘1’. (The numeral ‘0’ is held in reserve against possible future use for indexing that section.) Subsequent zettels filed into that section will either be directly related to it, in which case—following Luhmannian practice (Schmidt 2018, sect.4)—the next one will probably get an ID of ‘1a’, or not be related to it, so it would get a different independent number. A subsequent zettel that is related to this second one that is related to the first one, will end up being ‘1a1’. Another zettel that is related to 1 but not the two branched under it (1a and 1a1), would get ‘1b’. A zettel that is not related to any of these might get ‘2’, and so on. Luhmann’s numbering system utilised this alternating pattern of letters and numbers to allow for an almost infinite interpolation of new zettels into the existing set, without having to re-number any of the IDs of those installed prior. This meant he sometimes had zettels numbered with up to 13 characters (Schmidt 2016, 301). It is a very ingenious method and one well worth adopting for the in-section ID # part of the ZID, once the numbering system is learned and internalised.
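The branching pattern described above can be illustrated with a short sketch. The function names are hypothetical, and two simplifying assumptions are made: letter tokens are single lower-case letters (what should follow ‘z’ is left undefined), and cards are filed so that a branch sits directly behind its parent.

```python
import re


def tokens(in_id: str) -> list:
    """Split '1a1' into alternating number/letter tokens: ['1', 'a', '1']."""
    return re.findall(r"\d+|[a-z]+", in_id)


def first_child(in_id: str) -> str:
    """Branch beneath a zettel: '1' -> '1a', '1a' -> '1a1'."""
    return in_id + ("a" if in_id[-1].isdigit() else "1")


def next_sibling(in_id: str) -> str:
    """Continue at the same level: '1' -> '2', '1a' -> '1b'."""
    *head, last = tokens(in_id)
    if last.isdigit():
        bumped = str(int(last) + 1)
    elif len(last) == 1 and last != "z":
        bumped = chr(ord(last) + 1)
    else:
        # Assumption: the sketch does not define what follows 'z'.
        raise ValueError(f"cannot increment {last!r} in this sketch")
    return "".join(head) + bumped


def filing_key(in_id: str) -> list:
    """Sort key that compares numeric tokens as numbers ('2' before '10')."""
    return [(0, int(t), "") if t.isdigit() else (1, 0, t) for t in tokens(in_id)]


print(first_child("1"), first_child("1a"), next_sibling("1a"), next_sibling("1"))
# 1a 1a1 1b 2
print(sorted(["2", "1b", "10", "1a1", "1", "1a"], key=filing_key))
# ['1', '1a', '1a1', '1b', '2', '10']
```

No existing ID is ever renumbered: a new zettel either opens a branch (`first_child`) or continues a sequence (`next_sibling`).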

In a very real sense, the OoK numbering schema for the section ID § greatly resembles the Luhmannian numbering system for the in-section ID #, which was able to expand almost infinitely ‘into’ the ‘space’ between existing in-section IDs. This reveals a very useful property of the form of the OoK numbering system: its expandability to allow for new categories of knowledge to be created and used when needed.

Room to ‘expand’ the OoK schema

One of the fortunate aspects of the OoK structure is that its enumerative outline numbering framework still has a good deal of ‘empty space’ within it which is available to ‘extend into’ as our knowledge interests develop and expand. To see what this means, try to visualise a circular disc, with the special ‘hub’ sector, ‘0’, at the centre, and the nine main knowledge sectors 1-9 arranged around it outside (i.e., Parts 1-9 in the original OoK). This was the diagram Adler (1994, 7) used to explain the special nature of Part 10 (i.e., our sector ‘0’).

Within each of the nine outer sectors, there are Divisions—usually between two and seven of these—which do not necessarily ‘fill’ the ‘space’ theoretically available in that sector, and take up only a portion of it, by dint of their potentially open-ended numbering. Within each of these Divisions there are several Sections, generally between two and nine (apart from the special case of the 11 found in 9.6 alluded to above). These, too, do not ‘fill’ the ‘numbering space’ theoretically available within each Division. These Sections may in turn be subdivided down to further subsections, sub-subsections, etc, but in each case, the space allocated to a new subsection is less than what is numerically ‘available’ for this purpose. Hence, presently, there still exist large areas in the Outline structure containing nothing, not even (as it were) ‘sectioning’ information, while only a fraction of even these sectioned areas are actually ‘filled in’ (so to speak) with knowledge. These therefore have room for more to be added, and await our further ‘filling in’ of them as our knowledge accrues and expands beyond the present set of categories. One metaphor that comes to mind is that of a computer disc that has only been partially formatted, and still contains large unformatted areas waiting to be utilised if the need arises.

So, in short, the OoK structure is by no means ‘exhausted’ in terms of what knowledge it could be used to index at finer levels of detail. While we will generally not go to the extreme of using all seven sub-levels of the OoK classification, we nonetheless could do so, and indeed could go beyond even that if it ever became necessary, by extending the sub-sectioning alphanumerical notations in suitably unambiguous ways. However, going too far down the outline hierarchy may be counterproductive (paying mind to Luhmann’s warning mentioned at the end of the previous post), so it is not something to be done before the necessity; some looseness of filing is both useful and desirable, to allow for cross-fertilisation of ideas and concepts. The UDC auxiliaries are not only suited to this, but even designed to do so, as indicated by the quote from the UDC Consortium web page given near the end of the previous post.

Therefore, expanding upon the existing framework is as simple as adding an extra Division, Section, Sub-section, or whatever is required, as the need arises. We will do this in the next post when we come to create a home for Futures Studies in the schema, as well as when we add in certain useful elements from the UDC’s own 00 section concerning the processes of science and knowledge. In essence (pursuing the computer disc metaphor), sector 00 will become a sort of ‘boot sector’ for scholarship and knowledge seeking, establishing certain concepts and distinctions, as well as producing filing spaces and categories, that will be very useful in the newly expanded overall schema.

Stay tuned.

Next Time: Part V: Extending the schema

Note

Of course, there are differences between my 1994 hardcopy and the electronic copy at The Internet Archive, which appears to date from around 2010 (i.e., the last one published). Ideally, one would print out the electronic version, cut out the various index terms and reading suggestions, and just keep the actual Outline text, pasted onto sheets and photocopied. This would considerably reduce the number of printed pages that would need binding for a complete final form of the Outline.

References

Adler, Mortimer J. 1994. “The Circle of Learning.” In Propædia: Outline of Knowledge & Guide to the Britannica, 15th ed., 30:5–8. The New Encyclopaedia Britannica. 32 vols. Chicago: Encyclopaedia Britannica Inc.

Batley, Sue. 2014. “Classification Schemes for Specialist Collections.” In Classification in Theory and Practice, edited by Sue Batley, 2nd ed., 87–142. Oxford: Chandos Publishing. doi:10.1533/9781780634661.87.

Schmidt, Johannes F.K. 2016. “Niklas Luhmann’s Card Index: Thinking Tool, Communication Partner, Publication Machine.” In Forgetting Machines: Knowledge Management Evolution in Early Modern Europe, edited by Alberto Cevolini, 289–311. Library of the Written Word vol. 53. Leiden & Boston: Brill. doi:10.1163/9789004325258_014.

———. 2018. “Niklas Luhmann’s Card Index: The Fabrication of Serendipity.” Sociologica 12 (1): 53–60. doi:10.6092/issn.1971-8853/8350.

UDC Consortium. 2022. “About Universal Decimal Classification (UDC).” UDC Consortium. https://udcc.org/index.php/site/page?view=about.