Digital preservationIn library and archival science, digital preservation is a formal endeavor to ensure that digital information of continuing value remains accessible and usable. It involves planning, resource allocation, and application of preservation methods and technologies, and it combines policies, strategies and actions to ensure access to reformatted and "born-digital" content, regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.
Born-digitalThe term born-digital refers to materials that originate in a digital form. This is in contrast to digital reformatting, through which analog materials become digital, as in the case of files created by scanning physical paper records. It is most often used in relation to digital libraries and the issues that go along with said organizations, such as digital preservation and intellectual property. However, as technologies have advanced and spread, the concept of being born-digital has also been discussed in relation to personal consumer-based sectors, with the rise of e-books and evolving digital music.
Internet ArchiveThe Internet Archive is an American digital library founded on May 10, 1996, and chaired by free information advocate Brewster Kahle. It provides free access to collections of digitized materials like websites, software applications, music, audiovisual and print materials. The Archive is also an activist organization, advocating a free and open Internet. The Internet Archive holds more than 39 million print materials, 11.6 million pieces of audiovisual content, 2.6 million software programs, and 15 million audio files, among other holdings.
EbookAn ebook (short for electronic book), also known as an e-book or eBook, is a book publication made available in digital form, consisting of text, images, or both readable on the flat-panel display of computers or other electronic devices. Although sometimes defined as "an electronic version of a printed book", some e-books exist without a printed equivalent. E-books can be read on dedicated e-reader devices, but also on any computer device that features a controllable viewing screen, including desktop computers, laptops, tablets and smartphones.
Google BooksGoogle Books (previously known as Google Book Search, Google Print, and by its code-name Project Ocean) is a service from Google that searches the full text of books and magazines that Google has scanned, converted to text using optical character recognition (OCR), and stored in its digital database. Books are provided either by publishers and authors through the Google Books Partner Program, or by Google's library partners through the Library Project. Additionally, Google has partnered with a number of magazine publishers to digitize their archives.
DigitizationDigitization is the process of converting information into a digital (i.e. computer-readable) format. The result is the representation of an object, image, sound, document, or signal (usually an analog signal) obtained by generating a series of numbers that describe a discrete set of points or samples. The result is called digital representation or, more specifically, a digital image, for the object, and digital form, for the signal.
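As a rough illustration of "generating a series of numbers that describe a discrete set of points or samples", the following minimal sketch samples a continuous signal at evenly spaced times and maps each sample to an integer code. The function names (`digitize`, `tone`), the 8 Hz sample rate, and the 16 quantization levels are illustrative assumptions, not values taken from any standard.

```python
import math


def digitize(signal, duration_s, sample_rate_hz, levels):
    """Sample `signal` (a function of time returning values in [-1, 1])
    at evenly spaced points and quantize each sample to an integer
    code in the range 0 .. levels - 1."""
    sample_count = int(duration_s * sample_rate_hz)
    codes = []
    for i in range(sample_count):
        t = i / sample_rate_hz                   # time of the i-th sample
        value = max(-1.0, min(1.0, signal(t)))   # clamp to the expected range
        # Map [-1, 1] onto [0, levels - 1] and round to the nearest code.
        codes.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return codes


def tone(t):
    """Hypothetical analog input: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * 1.0 * t)


# One second of the tone, 8 samples per second, 16 possible codes (4 bits).
samples = digitize(tone, duration_s=1.0, sample_rate_hz=8, levels=16)
print(samples)
```

Raising the sample rate or the number of levels makes the list of numbers a closer description of the original signal, at the cost of more data to store.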
Bibliographic databaseA bibliographic database is a database of bibliographic records. This is an organised online collection of references to published written works like journal and newspaper articles, conference proceedings, reports, government and legal publications, patents and books. In contrast to library catalogue entries, a majority of the records in bibliographic databases describe articles and conference papers rather than complete monographs, and they generally contain very rich subject descriptions in the form of keywords, subject classification terms, or abstracts.
Book scanningBook scanning or book digitization (also: magazine scanning or magazine digitization) is the process of converting physical books and magazines into digital media such as images, electronic text, or electronic books (e-books) by using an image scanner. Large scale book scanning projects have made many books available online. Digital books can be easily distributed, reproduced, and read on-screen. Common file formats are DjVu, Portable Document Format (PDF), and Tagged Image File Format (TIFF).
PublicationTo publish is to make content available to the general public. While specific use of the term may vary among countries, it is usually applied to text, images, or other audio-visual content, including paper (newspapers, magazines, catalogs, etc.). Publication means the act of publishing, and also any copies issued for public distribution. Publication is a technical term in legal contexts and especially important in copyright legislation. An author of a work generally is the initial owner of the copyright on the work.
Link rotLink rot (also called link death, link breaking, or reference rot) is the phenomenon of hyperlinks tending over time to cease to point to their originally targeted file, web page, or server due to that resource being relocated to a new address or becoming permanently unavailable. A link that no longer points to its target, often called a broken, dead, or orphaned link, is a specific form of dangling pointer. The rate of link rot is a subject of study and research due to its significance to the internet's ability to preserve information.
Web archivingWeb archiving is the process of collecting portions of the World Wide Web to ensure the information is preserved in an archive for future researchers, historians, and the public. Web archivists typically employ web crawlers for automated capture due to the massive size and amount of information on the Web. The largest web archiving organization based on a bulk crawling approach is the Wayback Machine, which strives to maintain an archive of the entire Web.
Legal depositLegal deposit is a legal requirement that a person or group submit copies of their publications to a repository, usually a library. The number of copies required varies from country to country. Typically, the national library is the primary repository of these copies. In some countries there is also a legal deposit requirement placed on the government, and it is required to send copies of documents to publicly accessible libraries. The legislation covering the requirement varies from country to country, but is often enshrined in copyright law.
JSTORJSTOR (/ˈdʒeɪstɔːr/; short for Journal Storage) is a digital library founded in 1994. Originally containing digitized back issues of academic journals, it now encompasses books and other primary sources as well as current issues of journals in the humanities and social sciences. It provides full-text searches of almost 2,000 journals. Most access is by subscription but some of the site is public domain, and open access content is available free of charge. William G. Bowen, president of Princeton University from 1972 to 1988, founded JSTOR in 1994.
Project GutenbergProject Gutenberg (PG) is a volunteer effort to digitize and archive cultural works, as well as to "encourage the creation and distribution of eBooks." It was founded in 1971 by American writer Michael S. Hart and is the oldest digital library. Most of the items in its collection are the full texts of books or individual stories in the public domain. All files can be accessed for free under an open format layout, available on almost any computer. Project Gutenberg had reached 50,000 items in its collection of free eBooks.
Digital asset managementDigital asset management (DAM), together with its implementation as a computer application, is required in the collection of digital assets to ensure that the owner, and possibly their delegates, can perform operations on the data files. The term media asset management (MAM) may be used in reference to digital asset management when applied to the subset of digital objects commonly considered "media", namely audio recordings, photos, and videos.
ArchiveAn archive is an accumulation of historical records or materials – in any medium – or the physical facility in which they are located. Archives contain primary source documents that have accumulated over the course of an individual or organization's lifetime, and are kept to show the function of that person or organization. Professional archivists and historians generally understand archives to be records that have been naturally and necessarily generated as a product of regular legal, commercial, administrative, or social activities.
Digital rights managementDigital rights management (DRM) is the management of legal access to digital content. Various tools or technological protection measures (TPM), such as access control technologies, can restrict the use of proprietary hardware and copyrighted works. DRM technologies govern the use, modification and distribution of copyrighted works (e.g. software, multimedia content) and of systems that enforce these policies within devices. DRM technologies include licensing agreements and encryption.
LibraryA library is a collection of books, and possibly other materials and media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copy) or digital (soft copy) materials, and may be a physical location or a virtual space, or both. A library's collection normally includes printed materials which can be borrowed, and a reference section of publications which are not permitted to leave the library and can only be viewed inside the premises.
Aaron SwartzAaron Hillel Swartz (November 8, 1986 – January 11, 2013) was an American computer programmer, entrepreneur, writer, political organizer, and Internet hacktivist. As a programmer, Swartz helped develop the web feed format RSS; the technical architecture for Creative Commons, an organization dedicated to creating copyright licenses; the website framework web.py; and Markdown, a lightweight markup language format. Swartz was involved in the development of the social news aggregation website Reddit until he departed from the company in 2007.
Full-text searchIn text retrieval, full-text search refers to techniques for searching a single computer-stored document or a collection in a full-text database. Full-text search is distinguished from searches based on metadata or on parts of the original texts represented in databases (such as titles, abstracts, selected sections, or bibliographical references). In a full-text search, a search engine examines all of the words in every stored document as it tries to match search criteria (for example, text specified by a user).
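To make the idea of matching a query against all of the words in every stored document concrete, here is a minimal sketch that uses an inverted index, one common way to implement full-text search (the entry above does not prescribe any particular technique). The function names (`tokenize`, `build_index`, `search`) and the three sample documents are illustrative assumptions.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map every word to the set of document ids whose full text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return matches


# Hypothetical stored documents, keyed by an arbitrary id.
documents = {
    "d1": "Web archivists employ web crawlers for automated capture.",
    "d2": "Book scanning converts physical books into digital media.",
    "d3": "Digital preservation keeps digital information accessible.",
}

index = build_index(documents)
print(sorted(search(index, "digital")))
print(sorted(search(index, "digital media")))
```

Because the index is built from the complete text rather than from titles or abstracts, a query can match a word that appears anywhere in a document, which is the distinction the entry draws between full-text and metadata-based search.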