DBpediaDBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project. This structured information is made available on the World Wide Web. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets. In 2008, Tim Berners-Lee described DBpedia as one of the most famous parts of the decentralized Linked Data effort.
Knowledge graphIn knowledge representation and reasoning, a knowledge graph is a knowledge base that uses a graph-structured data model or topology to integrate data. Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the semantics underlying the terminology used. Since the development of the Semantic Web, knowledge graphs have often been associated with linked open data projects, focusing on the connections between concepts and entities.
WiktionaryWiktionary (UK: /ˈwɪkʃənəri/; US: /ˈwɪkʃənɛri/; rhyming with "dictionary") is a multilingual, web-based project to create a free content dictionary of terms (including words, phrases, proverbs, linguistic reconstructions, etc.) in all natural languages and in a number of artificial languages. These entries may contain definitions, images for illustration, pronunciations, etymologies, inflections, usage examples, quotations, related terms, and translations of terms into other languages, among other features.
TriplestoreA triplestore or RDF store is a purpose-built database for the storage and retrieval of triples through semantic queries. A triple is a data entity composed of subject–predicate–object, like "Bob is 35" or "Bob knows Fred". Much like a relational database, information in a triplestore is stored and retrieved via a query language. Unlike a relational database, a triplestore is optimized for the storage and retrieval of triples. In addition to queries, triples can usually be imported and exported using Resource Description Framework (RDF) and other formats.
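The subject–predicate–object idea above can be sketched in a few lines of Python. This is only an illustrative toy, not how any real triplestore is implemented: the class name `TinyTripleStore`, its methods, and the predicate names `age` and `knows` are hypothetical, and real systems add indexing, persistence, and a query language.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TinyTripleStore:
    """Hypothetical in-memory store used only to illustrate triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one subject-predicate-object statement."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return matching triples, sorted; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TinyTripleStore()
store.add("Bob", "age", "35")        # "Bob is 35"
store.add("Bob", "knows", "Fred")    # "Bob knows Fred"
store.add("Fred", "knows", "Alice")  # extra fact, invented for the example

bob_facts = store.match(subject="Bob")
who_knows = store.match(predicate="knows")
```

Here `bob_facts` holds every statement about Bob, while `who_knows` holds every "knows" relationship regardless of subject, which is roughly what a pattern in a semantic query expresses.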
Wikimedia CommonsWikimedia Commons (or simply Commons) is a media repository of free-to-use images, sounds, videos and other media. It is a project of the Wikimedia Foundation. Files from Wikimedia Commons can be used across all of the Wikimedia projects in all languages, including Wikipedia, Wikivoyage, Wikisource, Wikiquote, Wiktionary, Wikinews, Wikibooks, and Wikispecies, or downloaded for offsite use. As of February 2023, the repository contains over 90 million free-to-use media files, managed and editable by registered volunteers.
YAGO (database)YAGO (Yet Another Great Ontology) is an open source knowledge base developed at the Max Planck Institute for Informatics in Saarbrücken. It is automatically extracted from Wikipedia and other sources. As of 2019, YAGO3 has knowledge of more than 10 million entities and contains more than 120 million facts about these entities. The information in YAGO is extracted from Wikipedia (e.g., categories, redirects, infoboxes), WordNet (e.g., synsets, hyponymy), and GeoNames. The accuracy of YAGO was manually evaluated to be above 95% on a sample of facts.
Wikimedia FoundationThe Wikimedia Foundation, Inc. (WMF) is an American 501(c)(3) nonprofit organization headquartered in San Francisco, California, and registered as a charitable foundation under local laws. Best known as the hosting platform for Wikipedia, a crowdsourced online encyclopedia, it also hosts other related projects and MediaWiki, a wiki software. The Wikimedia Foundation was established in 2003 in St. Petersburg, Florida, by Jimmy Wales as a nonprofit way to fund Wikipedia, Wiktionary, and other crowdsourced wiki projects that had until then been hosted by Bomis, Wales's for-profit company.
Graph databaseA graph database (GDB) is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A key concept of the system is the graph (or edge or relationship). The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation.
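As a rough illustration of nodes, edges, and properties, the following Python sketch keeps an adjacency list so that the relationships of a node can be followed directly. The class `TinyPropertyGraph`, the labels `KNOWS` and `WORKS_AT`, and the sample data are all assumptions made for this example, not the API of any particular graph database.

```python
from typing import Any, Dict, List, Tuple


class TinyPropertyGraph:
    """Hypothetical property graph: nodes and edges both carry properties."""

    def __init__(self) -> None:
        # node id -> properties of that node
        self.nodes: Dict[str, Dict[str, Any]] = {}
        # source id -> list of (edge label, target id, edge properties)
        self.edges: Dict[str, List[Tuple[str, str, Dict[str, Any]]]] = {}

    def add_node(self, node_id: str, **props: Any) -> None:
        self.nodes[node_id] = props

    def add_edge(self, source: str, label: str, target: str, **props: Any) -> None:
        self.edges.setdefault(source, []).append((label, target, props))

    def neighbors(self, source: str, label: str) -> List[str]:
        """Follow edges with the given label out of `source`."""
        return [t for (lbl, t, _) in self.edges.get(source, []) if lbl == label]


graph = TinyPropertyGraph()
graph.add_node("alice", name="Alice")
graph.add_node("bob", name="Bob")
graph.add_node("acme", name="Acme Corp")
graph.add_edge("alice", "KNOWS", "bob", since=2019)
graph.add_edge("alice", "WORKS_AT", "acme")

friends = graph.neighbors("alice", "KNOWS")
employers = graph.neighbors("alice", "WORKS_AT")
```

Because each node keeps its outgoing edges next to it, finding what a node is linked to is a single lookup rather than a join across tables.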
WikisourceWikisource is an online digital library of free-content textual sources on a wiki, operated by the Wikimedia Foundation. Wikisource is the name of the project as a whole and the name for each instance of that project (each instance usually representing a different language); multiple Wikisources make up the overall project of Wikisource. The project's aim is to host all forms of free text, in many languages, and translations.
Resource Description FrameworkThe Resource Description Framework (RDF) is a World Wide Web Consortium (W3C) standard originally designed as a data model for metadata. It has come to be used as a general method for description and exchange of graph data. RDF provides a variety of syntax notations and data serialization formats, with Turtle (Terse RDF Triple Language) currently being the most widely used notation. RDF is a directed graph composed of triple statements.
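To make the "triple statements" concrete, the sketch below writes two triples in N-Triples, the simple line-based RDF syntax that is also valid Turtle. The `http://example.org/` identifiers and the helper `to_ntriples` are hypothetical, and the function handles only IRIs and plain string literals (no datatypes, language tags, or blank nodes).

```python
from typing import Iterable, Tuple

# (subject IRI, predicate IRI, object, object_is_literal)
Statement = Tuple[str, str, str, bool]


def to_ntriples(statements: Iterable[Statement]) -> str:
    """Serialize statements as N-Triples lines: <s> <p> <o> ."""
    lines = []
    for subject, predicate, obj, is_literal in statements:
        if is_literal:
            # Escape backslashes and double quotes inside the literal.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            obj_text = f'"{escaped}"'
        else:
            obj_text = f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {obj_text} .")
    return "\n".join(lines)


EX = "http://example.org/"  # placeholder namespace, assumed for the example
statements = [
    (EX + "Bob", EX + "knows", EX + "Fred", False),
    (EX + "Bob", EX + "age", "35", True),
]
document = to_ntriples(statements)
print(document)
```

Each output line is one edge of the directed graph: the subject and object are nodes, and the predicate labels the edge between them.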
Knowledge baseA knowledge base (KB) is a set of sentences, each sentence given in a knowledge representation language, with interfaces to tell new sentences and to ask questions about what is known, where either of these interfaces might use inference. It is a technology used to store complex structured data used by a computer system. The initial use of the term was in connection with expert systems, which were the first knowledge-based systems. The original use of the term knowledge base was to describe one of the two sub-systems of an expert system.
Wikimedia movementThe Wikimedia movement is the global community of contributors to the Wikimedia projects, including Wikipedia. This community directly builds and administers these projects, with a commitment to doing so using open standards and software. First created around and by Wikipedia's community of volunteer editors (Wikipedians), it has since expanded to other projects such as Wikimedia Commons and Wikidata, and to the volunteer software engineers and developers who contribute to MediaWiki, the software that powers the Wikimedia projects.
Linked dataIn computing, linked data is structured data which is interlinked with other data so it becomes more useful through semantic queries. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages only for human readers, it extends them to share information in a way that can be read automatically by computers. Part of the vision of linked data is for the Internet to become a global database.
MetadataMetadata (or metainformation) is "data that provides information about other data", but not the content of the data, such as the text of a message or the image itself. There are many distinct types of metadata, including: Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords. Structural metadata – metadata about containers of data that indicates how compound objects are put together, for example, how pages are ordered to form chapters.
WikipediaWikipedia is a free-content online encyclopedia written and maintained by a community of volunteers, collectively known as Wikipedians, through open collaboration and using a wiki-based editing system called MediaWiki. Wikipedia is the largest and most-read reference work in history, and has consistently been one of the 10 most popular websites. Created by Jimmy Wales and Larry Sanger on January 15, 2001, it is hosted by the Wikimedia Foundation, an American nonprofit organization.
English WikipediaThe English Wikipedia is the primary English-language edition of Wikipedia, an online encyclopedia. It was created by Jimmy Wales and Larry Sanger on January 15, 2001, as Wikipedia's first edition. English Wikipedia is hosted alongside other language editions by the Wikimedia Foundation, an American non-profit organization. Its content is written independently of other editions in various varieties of English, aiming to stay consistent within articles. Its internal newspaper is The Signpost.
MediaWikiMediaWiki is free and open-source wiki software originally developed by Magnus Manske for use on Wikipedia on January 25, 2002 and further improved by Lee Daniel Crocker; its development has since been coordinated by the Wikimedia Foundation. It powers most websites hosted by the Foundation, including Wikipedia, Wiktionary, Wikimedia Commons, Wikiquote, Meta-Wiki and Wikidata, which define a large part of the requirements for the software. MediaWiki is written in the PHP programming language and stores all text content in a database.
Entity linkingIn natural language processing, entity linking, also referred to as named-entity linking (NEL), named-entity disambiguation (NED), named-entity recognition and disambiguation (NERD) or named-entity normalization (NEN), is the task of assigning a unique identity to entities (such as famous individuals, locations, or companies) mentioned in text. For example, given the sentence "Paris is the capital of France", the idea is to determine that "Paris" refers to the city of Paris and not to Paris Hilton or any other entity that could be referred to as "Paris".
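The "Paris" example can be sketched as a toy disambiguator that picks the candidate whose context keywords overlap most with the sentence. Everything here is an assumption made for illustration: the candidate table `CANDIDATES`, its keyword sets, and the function `link_entity` are hypothetical, and practical entity linkers rely on large knowledge bases and learned models rather than hand-written keyword lists.

```python
from typing import Dict, Optional, Set

# Hypothetical candidate entities for each mention, with context keywords.
CANDIDATES: Dict[str, Dict[str, Set[str]]] = {
    "Paris": {
        "Paris (city)": {"capital", "france", "city", "seine"},
        "Paris Hilton": {"hotel", "heiress", "celebrity", "television"},
    }
}


def link_entity(mention: str, sentence: str) -> Optional[str]:
    """Return the candidate whose keywords best overlap the sentence."""
    candidates = CANDIDATES.get(mention)
    if not candidates:
        return None  # unknown mention: nothing to link to
    # Lowercase the words and strip simple punctuation before comparing.
    words = {w.strip('.,;:"\'').lower() for w in sentence.split()}
    # On a tie, max() keeps the first candidate listed.
    return max(candidates, key=lambda name: len(candidates[name] & words))


city = link_entity("Paris", "Paris is the capital of France")
person = link_entity("Paris", "Paris starred in a television show")
```

With these keyword sets, the first sentence shares "capital" and "france" with the city entry, so `city` resolves to the city, while the second shares only "television" with the other candidate.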