21st Century Library and Information Science

Time .. always changing things ..

Some thoughts based on my recent presentation to the INFODAYS14 conference held at Charles University, Prague, Czech Republic 5-7th November 2014.

The future of library and information science (LIS) is inextricably linked to the future of the document. Leaving aside for the moment the question of exactly “what is a document?”, this comes as no surprise to those of us working in this area, as we know that civilization owes its existence to recorded knowledge. For the time being, let us also allow ourselves to consider knowledge and information to be interchangeable terms and, within our LIS discipline, assume that for the purposes of communication such knowledge or information must be instantiated as a document of some kind.

We can further understand that documents contain, and therefore allow access to, ‘formal’ information – i.e. something that is published, and therefore subject to the processes of the information communication chain [Popper’s World III, instantiated in World I, physical objects]. This is in contrast to ‘informal’ information, which remains inside someone’s head [Popper’s World II]. Although developments in telepathic communication are starting to enter the research literature, we are still some way from being able to intercept and understand the thought processes of another being.

I am often called on to comment on the nature of library and information science. To reiterate, here at City University London, we consider the discipline of library and information science to be the study of documents, on their journey through the information communication chain.

The realm of library and information science then, is the realm of the document. We, as researchers and practitioners within this field, are concerned with the activities surrounding the creation, dissemination, management, organisation and retrieval, and use of documents. We study these processes of documentation through the lens of Hjørland’s concept of domain analysis, invoking study and praxis within facets such as knowledge organisation, information retrieval, document preservation, historical studies, and research into information behaviour.

The changes in this chain of events are driven largely by technology, although factors such as economics, politics and social tastes can all affect the business of recorded knowledge.

Let us take a look at some of the developments that can be seen to be influencing the nature and definition of documents, as 2014 draws to a close.

Information Theory

A quick scan of the literature on the definition of information reveals the troubled history of the concept central to our discipline. To date, there is no single, satisfactory explanation of what information actually is. There are many attempts at definition, and indeed theories, both within the field of LIS and within other, seemingly unrelated disciplines. A resurgence of interest in information communications theory can be seen to have heralded interest in information physics, philosophy of information and information biology. To some, the concept of information in these fields remains unconnected to the social discipline of library and information science, but to others, there is an interest in looking for connections and a possible theory of unification. See the Bawden and Robinson papers below for further reading in this latter area.

Data Science

The jump to prominence of data science and related areas (big data, data analytics, data visualisation) can be seen in the number of recent university courses being introduced (we have introduced one this year at City University), and is reflected in the required skills listed in many job advertisements. One of the top skill sets required by employers across the sectors is the ability to collect, analyse and interpret data. Data handling and communication now sit alongside the more traditional ‘verbal and written communication skills’ that professional level work demands. These skills are becoming essential to practitioners within library and information science as e-science and the digital humanities pour more and more data sets into our sphere of influence. These digital data files are one example of the new forms of document that require the attention and understanding of members of our discipline. The move to open data, and the expectation that data will be published alongside findings, are changing the way library and information professionals support scholarly communication. Indeed, the move towards digital scholarship heralds a new era for partnerships between librarians and researchers.

Digital Humanities

The mass digitization of literature, poetry, art and music has led to an increase in materials and methods available for humanities-based studies. There is a movement towards situating digital humanities research within the library and information environment, which seems to many to be its natural home. Again, as with e-science, the availability of large data sets and multi-media files is fuelling new growth areas for understanding patterns and trends (text mining), and for facilitating the final convergence of the GLAM sector, where digital renderings of text, image, sounds or even objects bring the previously separate collection disciplines into a melting pot of new services and interpretations. We are witnessing new roles redefining library and information science as a producer of new content, understanding and insight, a supporter of new forms of scholarship, and a leader in scholarly communications.

Publishing

Library and information science has always had an intimate relationship with the processes of publishing and dissemination. Changes in both scholarly and trade publishing are well documented, driven by the open access movement and the demand for new models of consumption respectively. The rapid growth of mobile devices and social media has revolutionised what it means to be an author and what it means to be a disseminator or a reader. It is probably fair to say that anyone with access to technology (not everyone) can be both an author and a publisher. New mechanisms for content creation (image/media capture devices, writing for transmedia) allow new forms of documents (interactive narratives) to flourish, and we are seeing a move towards content marketing, an increase in the use of images or video over text, and in data mashups. New tools to help us understand the reach and potential impact of new publishing formats, referred to as altmetrics, are entering the armoury of library and information science alongside existing bibliometric and informetric analyses. What it means to publish is changing alongside the development of the document.

Computer Science/Technology

Technological advances undoubtedly drive the most significant changes in the form and nature of documents.

I have written previously in this blog that developments in pervasive computing, multisensory network technologies and participatory human computer interfaces will allow new forms of documents to emerge, specifically ‘immersive’ documents, where unreality can be perceived as reality. News of current developments in virtual reality headsets and roomscape projection abounds, and consumer versions of games, narratives and training scenarios appear to be just around the corner, rather than sitting somewhere in the mid to long term future. Before we arrive at the availability of completely immersive documents, we will see a range of lesser, participatory experiences, such as the interactive, transmedia narratives mentioned in the previous section. In these narratives, the story reaches out beyond the imaginary world into the reality of the reader, with texts, phone calls and connections seemingly coming from characters within the plot. The way the narrative plays out can be influenced by the reader, as can the ending.

The blurring of boundaries between a game, a learning experience or pure fiction with this type of document is evident. There will also be ethical implications with regard to how these documents are used.

Implications for Library and Information Science

As documents evolve, so then will the scope and processes of what we understand as library and information science. New forms of document will require extensions and adaptations to our current tools for knowledge organisation, new information architectures and new understandings of human information behaviours. Most interesting, perhaps, for the LIS profession will be the need to engage with and promote ‘immersive literacy’, possibly in a way similar to that which Gilster suggested for digital literacy less than two decades ago.

Further Reading

Bawden D and Robinson L (2013). “Deep down things”: in what ways is information physical, and why does it matter for LIS? Information Research 18(3), paper C03 [online], available at http://InformationR.net/ir/18-3/colis/paperC03.html

Gilster P (1997). Digital Literacy. New York: Wiley.

Hjørland B (2002). Domain Analysis in Information Science: Eleven approaches – traditional as well as innovative. Journal of Documentation, vol 58(4), 422-462.

Robinson L (2009). Information Science: the communication chain and domain analysis. Journal of Documentation, vol 65(4), 578-591.

Robinson L and Bawden D (2013). Mind the gap: transitions between concepts of information in varied domains. In: Theories of information, communication and knowledge. A multidisciplinary approach. Eds. Ibekwe-SanJuan F and Dousa T. Springer.

Robinson L (2014a). Multisensory, Pervasive, Immersive: towards a new generation of documents. Journal of the Association for Information Science and Technology, in press.

Robinson L (2014b). Immersive information behaviour; using the documents of the future. New Library World, in press.

Information is the new black

It must be the popularising effect of James Gleick’s new book “The Information”, because suddenly everyone I meet wants to talk about information: its history, its epistemology and Shannon-Weaver’s 1948 mathematical theory of communication (MTC), which became known as the mathematical theory of information. This is certainly good news for our information science course, where information has been considered from an academic perspective since 1961. I feel my time has come; all those hours spent memorizing equations to show that I truly, deeply understood how many signals you can push down a channel of a certain size, allowing for noise, have finally been rewarded, and I can now brandish my information-science credentials with a superior air of I told you so. Information is the new black, and everyone is wearing it.

I believed that I would forget Shannon’s theory entirely, as soon as the exam was over. It did not seem so relevant to my work at the time, which was with information resources in toxicology. Life, however, with a patient smirk, ensured that the ashes of the MTC rose like a phoenix 20 years later, when I was faced with presenting the mathematical good news to contemporary LIS students taking our Library and Information Science Foundation module as part of their masters. I dusted off my 1986 copy of Robert Cole’s “Computer Communications”, my notes still there in the margins of page 10, where I left them.

The issue I faced was one of presenting a definition of ‘information-science’, and of outlining its history as a discipline, to modern LIS students. Many of the papers considering the origins of information science gaze back in time to illuminate Shannon’s equations with a rosy pink glow, suggesting that his theory somehow led to the birth of information science as a true science (Shera 1968, Meadows 1987). This was the story in the 1980s, but in the 21st century a more plausible thread is emphasized: the work of Kaiser, Otlet and Farradane on the indexing of documents, which suggests that the MTC was a bit of a red herring with respect to the history of information science. Rather, information science grew out of a need to control scientific information, coupled with the feeling amongst scientists that this activity was somehow separate from either special-librarianship or documentation, the more continental term for dealing with the literature (see Gilchrist 2009, Vickery 2004, Webber 2003).

MTC

A look back at the original ideas and documents shows that Shannon’s work was built on that of Hartley (1928). Stonier (1990 p 54) refers to Hartley:

“.. who defined information as the successive selection of signs or words from a given list. Hartley, concerned with the transmission of information, rejected all subjective factors such as meaning, since his interest lay in the transmission of signs or physical signals.”

Consequently, Shannon used the term information, even though his emphasis was on signalling. The interpretation of the MTC as a theory of information was thus somewhat coincidental, but this did not prevent it being embraced as a foundation of a true ‘information science’.
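For readers who like to see the arithmetic, Hartley’s measure is usually written as H = n log s, where s is the number of signs available and n is the number of successive selections. What follows is a minimal sketch, assuming base-2 logarithms so that the result comes out in bits (Hartley himself left the base open); the function name is my own illustration, not something taken from the 1928 paper.

```python
import math


def hartley_information(selections: int, alphabet_size: int) -> float:
    """Hartley's measure H = n * log2(s), in bits.

    selections    -- n, the number of successive selections made
    alphabet_size -- s, the number of signs available at each selection
    The base-2 logarithm is an assumption made here to obtain bits.
    """
    if selections < 0 or alphabet_size < 1:
        raise ValueError("need selections >= 0 and alphabet_size >= 1")
    return selections * math.log2(alphabet_size)


# One choice between two signs.
print(hartley_information(1, 2))
# Three successive selections from a list of eight signs.
print(hartley_information(3, 8))
```

Nothing in the calculation refers to what the signs mean, which is the point Stonier makes about Hartley.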

Shannon himself suggested that there were likely to be many theories of information. More recently, authors such as Stonier (1992) and Floridi (2010) have reiterated that MTC is about data communication rather than meaningful information.

Floridi (2010 p 42 and 44) explains:

“MTC is primarily a study of the properties of a channel of communication, and of codes that can efficiently encipher data into recordable and transmittable signals.”

“.. since MTC is a theory of information without meaning, (not in the sense of meaningless, but in the sense of not yet meaningful), and since [information – meaning = data], mathematical ‘theory of data communication’ is a far more appropriate description…”

He quotes Weaver as confirming:

“The mathematical theory of communication deals with the carriers of information, symbols and signals, not with information itself.”

Floridi’s definition of information as ‘meaningful data’ is more aligned to the field of information science as understood for our LIS related courses. Whilst we can still argue what is data and what is meaning, we can see that the MTC utilizes ‘information’ as a physical quantity more akin to the bit, rather than the meaningful information handled by library and information scientists.

This difference is set out by Stonier (1990, p 17):

“In contrast to physical information, there exists human information which includes the information created, interpreted, organised or transmitted by human beings.”

Nonetheless, the MTC is still relevant to today’s information science courses because it has played a pivotal role in the subsequent definitions and theories about information per se. And it is rather hard to have information science without an understanding of ‘information’. Many papers have been written on theories of information, and on the relevance of such theories to information science (see, for example, Cornelius 2002).

MTC and other disciplines

The MTC provides the background for signalling and communication theory within fields as diverse as engineering and neurophysiology. At the same time that Shannon was writing, Norbert Wiener was independently considering the problems of signalling and background noise. Wiener (1948 p 18) writes that they:

“.. had to develop a statistical theory of the amount of information, in which the unit amount of information was that transmitted as a single decision between equally probable alternatives.”

Further (p 19), that

“This idea occurred at about the same time to several writers, among them the statistician R. A. Fisher, Dr. Shannon of the Bell Telephone Laboratories, and the author.”

Wiener decided to:

“call the entire field of control and communication theory, whether in the machine or in the animal, by the name Cybernetics”.

The relationship of information to statistical probability (the amount of information being a statistical probability) meant that information in Shannon and Wiener’s sense related readily to entropy (anecdotally von Neumann is said to have suggested to Shannon that he use the term entropy, as it was already in use within the field of thermodynamics, but not widely understood).

“The quantity which uniquely meets the natural requirements that one sets up for ‘information’ turns out to be exactly that which is known in thermodynamics as entropy.”

Shannon and Weaver (1949) p 103

“As the amount of information in a system is a measure of its degree of organization, so the entropy of a system is a measure of its degree of disorganization; and the one is simply the negative of the other.”

Wiener (1948) p 18
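A small numerical sketch may help here. It uses the standard formula for Shannon entropy over a discrete set of probabilities, H = -Σ p log2 p; the function name and the example distributions are illustrative assumptions of mine, not anything taken from Shannon and Weaver or from Wiener.

```python
import math
from typing import Sequence


def shannon_entropy(probabilities: Sequence[float]) -> float:
    """Shannon entropy H = -sum(p * log2(p)), in bits.

    Outcomes with zero probability contribute nothing, by the usual convention.
    """
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# A single decision between two equally probable alternatives.
print(shannon_entropy([0.5, 0.5]))
# A certain outcome leaves no uncertainty.
print(shannon_entropy([1.0]))
# A biased source is less uncertain than a fair one.
print(shannon_entropy([0.9, 0.1]))
```

The first case is the ‘single decision between equally probable alternatives’ that Wiener takes as the unit amount of information.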

The link between information and entropy had been around for some time. In 1929, Szilard wrote about Maxwell’s demon, which could sort out the faster molecules from the slower ones in a chamber of gas. Szilard concluded that the demon had information about the molecules of gas, and was converting information into a form of negative entropy.

The term ‘negentropy’ was coined in 1956 by Brillouin:

“… information can be changed into negentropy, and that information, whether bound or free, can be obtained only at the expense of the negentropy of some physical system.”

Brillouin (1956) p 154

Brillouin’s conclusion was that information is associated with order or organization, and that as one system becomes organized (entropy decrease), another system must become more disorganized (entropy increase).

Stonier (1992 p 10), agrees:

“Any system exhibiting organization contains information.”

A well-known anomaly becomes apparent, however, when over 60 years later we try to understand the correlation between information and either entropy or probability. A trawl through the original equations and explanations, and subsequent revisitations, reveals that an increase in information can be associated with either an increase or decrease in entropy/probability according to your viewpoint. Tom Stonier (1990) refers to this in chapter 5, but Qvortrup (1993) gives a more detailed explanation:

“In reality, however, Wiener’s theory of information is not the same, but the opposite of Shannon’s theory. While to Shannon information is inversely proportional to probability, to Wiener it is directly proportional to probability. To Shannon, information and order are opposed; to Wiener they are closely related.”
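The sign question is easier to see with numbers. In Shannon’s framework the information attached to a single outcome of probability p is commonly written as -log2 p, so rarer outcomes carry more bits. The sketch below computes that quantity and, purely to illustrate the opposite convention that Qvortrup attributes to Wiener, its negative. Both function names are hypothetical labels of mine, and the second is a simplified reading, not Wiener’s own formula.

```python
import math


def shannon_style_information(p: float) -> float:
    """Self-information -log2(p), in bits: grows as the outcome gets less probable."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def negated_information(p: float) -> float:
    """The same quantity with the sign flipped, so it grows as p increases.

    Illustrative assumption only, standing in for the opposite sign convention.
    """
    return -shannon_style_information(p)


# As p falls, the first measure rises and the second falls.
for p in (0.5, 0.25, 0.125):
    print(p, shannon_style_information(p), negated_information(p))
```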

The correlation between the measurement of entropy and information did, however, lead to the separate field of information-physics, where information is considered to be a fundamental, measurable property of the universe, similar to energy (Stonier 1990).

This field stimulates much debate, and is currently enjoying what passes for popularity in science. A recent article in New Scientist tells how Shannon’s entropy provides a reliable indicator of the unpredictability of information, and thus of uncertainty, and how this has been related to the quantum world and Heisenberg’s uncertainty principle (Ananthaswamy 2011).

Information-biology also appears to stem from work undertaken around the MTC. The connection between signalling in engineering and physiology was made by Wiener in the 1940s, and in 1944 Schrödinger, in his book “What is Life?”, made a connection with entropy as he considered that a living organism:

“… feeds upon negative entropy.”

Further that:

“.. the device by which an organism maintains itself stationary at a fairly high level of orderliness (= fairly low level of entropy) really consists in continually sucking orderliness from its environment.”

In the same book, Schrödinger outlined the way in which genetic information might be stored, although the molecular structure of DNA was not published until 1953, by Crick and Watson (see Crick 1988). The genetic information coded in the nucleotides of the DNA is transcribed into messenger RNA and used to synthesize proteins. Information contained in genetic sequences also plays a role in the inheritance of phenotypes, so that informational approaches have been made within the study of biology (see Floridi 2010, also for discussion of neural information).
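To make the idea of coded genetic information a little more concrete, here is a toy sketch of the two steps just described: a DNA coding strand is rewritten as messenger RNA, and the RNA is then read three bases at a time to give amino acids. The codon table is deliberately partial, and the function names and the example sequence are my own illustrations; real transcription and translation involve far more machinery than this.

```python
from typing import List

# A deliberately partial codon table (standard genetic code); the full table has 64 entries.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCC": "Ala",
    "AAA": "Lys",
    "UGG": "Trp",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> List[str]:
    """Read codons three bases at a time until a stop codon or the end."""
    protein = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        # "?" marks a codon missing from this partial table
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "?")
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# Hypothetical twelve-base sequence, chosen only for illustration.
print(translate(transcribe("ATGTTTGGCTAA")))
```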

Information and LIS

For the purposes of our library and information science courses here at City University, we consider information as that which is ‘recorded for the purposes of meaningful, human communication’. Although I personally find Floridi’s definition helpful, information in our model is open to definition and interpretation, and is often used interchangeably with the term ‘knowledge’. In either case we regard the information as being instantiated within a ‘document’. The term ‘document’ also does not demand a definitive explanation; it merely needs to be understood as the focus of ‘information science’, its practitioners and researchers.

To complete the picture, when I became Program Director for #citylis at City University London, I wanted to strengthen and clarify the way in which we defined ‘information science’, and particularly to explain its relationship with library science (Robinson 2009). I suggested that library science and information science were part of the same disciplinary spectrum, and that information science (used here to include library-science) could be understood as the study of the information-communication chain, represented below:

Author  —> Publication and Dissemination —> Organisation —> Indexing and Retrieval —>  User

The chain represents the flow of recorded information, instantiated as documents, from the original author or creator to the user. The understanding and development of the activities within the communication chain is what library and information specialists do in both practice and research. As a point of explanation, I take organisation in the model to include the working of actual organisations such as libraries and institutions, information management and policy, and information law. Information organisation per se fits within the indexing and retrieval category.

Our subject is thus a very broad area of study, one which is perhaps better referred to as the information sciences. The question of how we study the activities of the model can be answered by applying Hjørland’s underlying theory for information science, domain analysis (Hjørland 2002). The domain analytic paradigm describes the competencies of information specialists, such as knowledge organization, bibliometrics, epistemology and user studies. The competencies or aspects distinguish what is unique about the information specialist, in contrast to the subject specialist. Further, domain analysis can be seen as the bridge between academic theory and vocational practice; each competency of domain analysis can be approached from either the point of view of research or of practice.

There are many definitions of information science, and there are other associated theories or meta-theories, the latter of which may also be associated with a philosophical stance. Nonetheless, the model portrayed above has proved to be a robust foundation for teaching and research, yet it is flexible enough to accommodate diverse opinions and debate as to what is meant by ‘information’. It allows for diverse theories of information.

It is interesting to reflect on whether ‘information’ as understood for the purposes of library and information science has any connection with ‘information’ as understood by physics and/or biology, or whether it is a standalone concept. Indeed later authors such as Bateson (1972) have suggested that if information is inversely related to probability, as Shannon says, then it is also related to meaning, as meaning is a way of reducing complexity. Cornelius (2002) reviews the literature attempting to elucidate a theory of information for information science (see also Zunde 1981, Meadow and Yuan 1997).

At a recent conference in Lyon, Birger Hjørland’s (2011) presentation considered the question of whether it was possible to have information science without information. He writes that there should at least be some understanding of the concept that supports our aims, but concludes:

“.. we cannot start by defining information and then proceed from that definition. We have to consider which field we are working in, and what kind of theoretical perspectives are best suited to support our goals.”

I agree with him. I do not think we can have information science without a consideration of what we mean by information – but information is a complex concept, and one that can be interpreted in several ways, according to the discipline doing the interpretation, and then again within any given discipline per se. It is not an easy subject to study, despite its sudden popularity. The literature of information theory is extensive, and scary maths can be found in most of it. Nonetheless, it is essential for anyone within our profession to have in mind an understanding of what we are working with; otherwise it is impossible to justify what we are doing, and we appear nondescript. Understanding information is like wearing black. Any colour will do, but black makes you look so much taller and slimmer.

References

Ananthaswamy A (2011). Uncertainty untangled. New Scientist, 30 April 2011, 28-31

Bateson G (1972). Steps to an ecology of mind. Ballantine: New York

Brillouin L (1956). Science and information theory. Academic Press: New York

Cornelius I (2002). Theorizing information science. Annual Review of Information Science and Technology 2002, 393-425

Crick F (1988). What mad pursuit. A personal view of scientific discovery. Penguin: London

Floridi L (2010). Information: a very short introduction. Oxford University Press: Oxford

Gilchrist A (2009). Editorial. In: Information science in transition. Facet: London

Hartley RVL (1928). Transmission of information. Bell System Technical Journal, vol 7 535-563

Hjørland B (2011). The nature of information science and its core concepts. Paper presented at: Colloque sur l’épistémologie comparée des concepts d’information et de communication dans les disciplines scientifiques (EPICIC), Université Lyon3, April 8th 2011. Available from: http://isko-france.asso.fr/epicic/en/node/18

Meadow CT and Yuan W (1997). Measuring the impact of information: defining the concepts. Information Processing and Management, vol 33(6) 697-714

Meadows AJ (1987). Introduction. In: The origins of information science. Taylor Graham: London

Qvortrup L (1993). The controversy of the concept of information. Cybernetics and Human Knowing, vol 1(4) 3-24

Robinson L (2009). Information science: communication and domain analysis. Journal of Documentation, vol 65(4) 578-591

Schrödinger E (1944). What is life? The physical aspect of the living cell. Cambridge University Press: Cambridge

Shannon CE and Weaver W (1949). The mathematical theory of communication. University of Illinois Press: Urbana

Shera JH (1968). Of librarianship, documentation and information science. Unesco Bulletin for Libraries, 22(2) 58-65

Stonier T (1990). Information and the internal structure of the universe. Springer-Verlag: New York

Stonier T (1992). Beyond information. The natural history of intelligence. Springer-Verlag: New York

Szilard L (1929). Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen. Zeitschrift für Physik, vol 53 840-856

Vickery B (2004). The long search for information. Occasional Papers no. 213. Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign

Webber S (2003). Information science in 2003: a critique. Journal of Information Science, vol 29(4) 311-330

Wiener N (1948). Cybernetics: or control and communication in the animal and the machine. Wiley: New York

Zunde P (1981). Information theory and information science. Information Processing and Management, vol 17(6) 341-347