Digital Humanities at the Center for Religious Studies
CERES Computer Café: Retrieving and extracting information from the Web
The Web has become a routine source of information, and academic research projects and other providers regularly publish information and data through their websites. However, the way this information is presented often makes it difficult to access and reuse for one’s own research. As a result, much time is spent copying and pasting content from web pages and databases into spreadsheets, where it can be worked with more easily. Web scraping automates these time-consuming, repetitive tasks and so speeds up the process.
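To give a rough idea of what such automation can look like, the following is a minimal sketch in Python that uses only the standard library. It reads an HTML table and turns it into CSV text that a spreadsheet program can open. The sample page content and the names `TableExtractor`, `SAMPLE_HTML`, and `html_table_to_csv` are illustrative assumptions, not part of any particular project or tool.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row currently being read
        self._cell = None   # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page content; a real script would download the page first.
SAMPLE_HTML = """
<table>
  <tr><th>Title</th><th>Year</th></tr>
  <tr><td>Example manuscript A</td><td>1650</td></tr>
  <tr><td>Example manuscript B</td><td>1702</td></tr>
</table>
"""


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML))
```

In practice the HTML would come from a downloaded page rather than a string in the script, and real pages are usually messier than this sample, so the extraction rules tend to need adjusting for each site.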