Turning a bibliography into a network analysis

If you have read my project page, you know my plan is to do a multi-faceted analysis of the community-based archaeology movement for my dissertation, focusing on the role of institutions in shaping archaeological thought. Currently I’m working on the social network analysis facet. Although I have some experience creating databases and using tools to transform their data into new formats, I have not done network analyses or worked with large bibliographic databases before. This post is a new version of my notes on how I’m going about this. As anyone who works in a technical field will tell you, every project plan is a work in progress until it’s done, and this most momentous of projects is no exception.


Types of contributions I’m mining: (more about this topic here)

  • book
  • chapter in edited volume
  • journal article
  • blog post
  • web page
  • article for popular media
  • presentation to the public or at a professional conference
  • posters
  • outreach materials
  • document shared online by author
  • public grant applications
  • publicized grantee lists



Nodes = people & institutions

From bibliography spreadsheet, extract:

  • list of names
  • list of institutions (e.g. presses, grant agencies, universities)
  • co-authorship relationship?
  • type of connection (as demonstrated in the item)
    • employee/employer
    • co-author
    • grantee/grantor
    • publisher/author
    • more stuff I can’t think of right now I’m sure
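A minimal sketch of that extraction in Python, to make the node and connection lists concrete. Everything here is an assumption for illustration: the field names (authors, publisher, funder), the sample people and institutions, and the idea that author names have already been split into a list. It is one possible shape for the data, not a finished method.

```python
from itertools import combinations

# Hypothetical bibliography records; field names and values are invented
# for illustration and would need to match the real spreadsheet columns.
records = [
    {"authors": ["Jane Doe", "John Public"],
     "publisher": "Example University Press", "funder": ""},
    {"authors": ["Jane Doe"],
     "publisher": "", "funder": "Example Foundation"},
]


def extract_network(records):
    """Return (people, institutions, edges) from bibliography records.

    Each edge is a (source, target, connection_type) tuple.
    """
    people, institutions, edges = set(), set(), []
    for rec in records:
        authors = rec["authors"]
        people.update(authors)
        # Every pair of authors on the same item is a co-author connection.
        for a, b in combinations(sorted(authors), 2):
            edges.append((a, b, "co-author"))
        # Each author is connected to the item's publisher, if recorded.
        if rec.get("publisher"):
            institutions.add(rec["publisher"])
            edges.extend((a, rec["publisher"], "publisher/author") for a in authors)
        # Each author is connected to the item's funder, if recorded.
        if rec.get("funder"):
            institutions.add(rec["funder"])
            edges.extend((a, rec["funder"], "grantee/grantor") for a in authors)
    return people, institutions, edges


people, institutions, edges = extract_network(records)
```

Keeping the connection type on each edge means the same pair of nodes can be linked in more than one way, which matches the list of connection types above.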

Track connections between people, with the data on those connections held in the database

Use FileMaker to reorganize and export the info in a format that includes counts and field names
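For comparison, here is a rough sketch of what an export "with counts" might look like if it were produced in Python rather than FileMaker. The column names (source, target, type, weight) and the sample connections are assumptions made for the example; the count of repeated connections becomes the weight of each row.

```python
import csv
import io
from collections import Counter


def edge_counts_to_csv(edges, out):
    """Write one row per distinct connection, with its count as the weight."""
    counts = Counter(edges)
    writer = csv.writer(out)
    writer.writerow(["source", "target", "type", "weight"])
    for (source, target, kind), weight in sorted(counts.items()):
        writer.writerow([source, target, kind, weight])
    return counts


# Hypothetical connections; the repeated pair stands for two co-authored items.
edges = [
    ("Jane Doe", "John Public", "co-author"),
    ("Jane Doe", "John Public", "co-author"),
    ("Jane Doe", "Example Foundation", "grantee/grantor"),
]

buffer = io.StringIO()  # swap in open("edges.csv", "w", newline="") for a real file
counts = edge_counts_to_csv(edges, buffer)
print(buffer.getvalue())
```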


Current tables in database:

1. Bibliography with all fields, themes chosen by me, themes chosen by McDavid et al

2. Project list: master list of projects, PIs, geographic location if possible (to make sure all the records of public products are attached by project)



Current SNA activities:

1. Double-checking that every entry fits the research plan (power restructuring in the research encounter as a goal) – 50% done, many articles in process of being obtained

2. Systematically comb through remaining publications and compiled bibliographies for sources from 1990-2014 (then stop, for God’s sake!)

3. Begin contacting granting agencies and other institutions for potential sources to add

4. Social network analysis tutorials by Shawn Graham (and other relevant technical skills from his course) – 50% done

5. Create author (node) list by extracting info from bibliography data

6. Create institution (node) list by extracting info from bibliography data


Those last two are tonight’s activity. Navigating all those commas, ampersands, and my beloved Society for American Archaeology’s citation format should make it extra interesting!
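A minimal sketch of the comma-and-ampersand problem in Python, assuming author fields follow the common pattern where the first author is inverted ("Last, First") and later authors are in normal order, joined by "and" or "&". The function name and the sample names are hypothetical. It will get some cases wrong: suffixes like "Jr." and institutional authors whose names contain "and" would need manual cleanup.

```python
import re


def parse_authors(author_field):
    """Split one author field into a list of 'First Last' names."""
    # Turn ' and ' or ' & ' (with or without a preceding comma) into a comma.
    normalized = re.sub(r"\s*,?\s*(?:&|\band\b)\s*", ", ", author_field)
    parts = [p.strip() for p in normalized.split(",") if p.strip()]
    if len(parts) <= 1:
        # Empty field, or a single name with no comma (e.g. an institution).
        return parts
    # The first author is inverted: 'Last, First' becomes 'First Last'.
    names = [f"{parts[1]} {parts[0]}"]
    # Remaining authors are already in 'First Last' order.
    names.extend(parts[2:])
    return names


two_authors = parse_authors("Doe, Jane, and John Q. Public")
ampersand = parse_authors("Doe, Jane & John Q. Public")
three_authors = parse_authors("Doe, Jane, John Q. Public, and Ann Example")
institution = parse_authors("Society for American Archaeology")
```

Running every author field through one function like this keeps the spelling of each name consistent, which matters because two spellings of the same person would otherwise become two separate nodes.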
