Maxime Gabella

Knowledge is useless without structure. While the classification of knowledge has been an enduring philosophical enterprise, it recently found applications in computer science, notably for artificial intelligence. The availability of large databases allowed for complex ontologies to be built automatically, for example by extracting structured content from Wikipedia. However, this approach is subject to manual categorization decisions made by online editors. Here we show that an implicit classification system emerges spontaneously on Wikipedia. We study the network of first links between articles, and find that it centers on a core cycle involving concepts of fundamental classifying importance. We argue that this structure is rooted in cultural history. For European languages, articles like Philosophy and Science are central, whereas Human and Earth dominate for East Asian languages. This reflects the differences between ancient Greek thought and Chinese tradition. Our results reveal the powerful influence of culture on the intrinsic architecture of complex data sets.
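
To make the idea of a first-link network concrete, here is a minimal Python sketch. It assumes each article is mapped to the single article its first link points to, so following links from any starting article must eventually either hit a dead end or enter a cycle. The function name `find_cycle` and the toy mapping with placeholder article names are hypothetical illustrations, not the author's code or data.

```python
def find_cycle(first_link, start):
    """Follow first links from `start` and return the cycle that is reached.

    `first_link` maps an article name to the article its first link points to.
    Returns the cycle as a list of article names, or an empty list if the walk
    reaches an article that has no outgoing first link.
    """
    position = {}  # article -> index in `path` at which it was first visited
    path = []
    node = start
    while node in first_link and node not in position:
        position[node] = len(path)
        path.append(node)
        node = first_link[node]
    if node in position:
        # The walk returned to an article already on the path: that tail is the cycle.
        return path[position[node]:]
    return []  # dead end: the last article has no first link


# Hypothetical toy network with placeholder names (not real Wikipedia data).
toy_first_link = {
    "Article A": "Article B",
    "Article B": "Article C",
    "Article C": "Article D",
    "Article D": "Article B",  # closes the cycle B -> C -> D -> B
    "Article E": "Article C",
}

print(find_cycle(toy_first_link, "Article A"))
print(find_cycle(toy_first_link, "Article E"))
```

In this toy example both starting articles end up in the same three-article cycle, which mirrors in miniature the kind of core cycle described above. On a real data set the mapping would have to be extracted from the article texts, a step this sketch does not attempt.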

location

arxiv.org
