Analytical Access to the Domain Dark Archive

Wednesday, 19 December 2012

Exploring and Uncovering British Euroscepticism in the Dark Archive

Here is another in our series of guest posts by those researchers who plan to use the archive for topics of particular interest to them:

Richard Deswarte - 'Exploring and Uncovering British Euroscepticism in the Dark Archive'

Britain's relationship to and subsequent engagement in the process of European integration is one of the most important political, economic and social developments of the last 50 years. This relationship was controversial even before the UK joined the EEC, as it then was, in 1973, and has certainly remained controversial ever since. The views and arguments of those individuals and groups who have opposed British membership, commonly referred to over the last twenty years as 'Euroscepticism', have been one of the enduring elements of British political and media debate. In the previous 15 years - exactly the period of the Web Domain dataset - much of this debate has been undertaken on the Web, with many pro- and anti-European groups setting up webpages and engaging in debates over the Web via blogs and other postings. To date there has been no dedicated research based on these online sites and debates. In conjunction with more traditional archival research that I am undertaking on British Euroscepticism, my AADDA project will take the opportunity to uncover and analyse the phenomenon of Euroscepticism on the Web.

In doing this research the following tools and digital research methods will be utilised. In the first instance I will engage in some Google-style Ngram searching based on such key terms as Euroscepticism, EU, UKIP, Euroreferendum, etc. This should produce some interesting aggregate and qualitative results, and patterns relating to the volume, timing and variety of Websites and references. Following this I will undertake some proximity searching of related terms to see if this brings up different results and patterns. I am also keen to see what searching under images, as one can do in the current UK Web Archive, brings in terms of results, given what I suspect will be a large number of images on these webpages. If time allows, it will be interesting to see whether sentiment analysis can be applied to gauge the degrees of negativity of Webpages/websites, and how successfully it can do so. Finally I will finish by undertaking some filtering of the results based on such elements as domain type and medium type to see what interesting patterns, if any, emerge. At the same time I will be open to trying out some of the other tools and methods that the other researchers are finding particularly successful in their case studies.
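By way of illustration only, the Ngram searching described above amounts to counting how often each key term occurs in pages from each year. The following is a minimal sketch in Python, not the archive's actual tooling: the function names, the (year, text) input format and the sample sentences are all assumptions made for the example.

```python
import re
from collections import Counter

def ngrams(text, n):
    """Return the lowercase word n-grams found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def term_counts_by_year(pages, terms):
    """Count occurrences of each term per year.

    pages is an iterable of (year, text) pairs (a hypothetical format);
    terms are lowercase words or phrases.
    """
    counts = {}
    for year, text in pages:
        year_counts = counts.setdefault(year, Counter())
        for term in terms:
            n = len(term.split())  # a two-word term is matched as a 2-gram
            year_counts[term] += ngrams(text, n).count(term)
    return counts

# Invented sample text, standing in for archived pages.
sample_pages = [
    (2004, "UKIP campaigns against the EU. The EU constitution is debated."),
    (2009, "Euroscepticism grows as UKIP gains seats."),
]
result = term_counts_by_year(sample_pages, ["eu", "ukip", "euroscepticism"])
```

Plotting such counts year by year would give the kind of aggregate pattern of volume and timing mentioned above; whole-word matching means that 'EU' is not counted inside 'Euroscepticism'.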

Tuesday, 27 November 2012

Sentiment Analysis and the Reception of the Liverpool Poets

This is the latest in our series of guest posts from the AADDA researchers who are proposing ways in which the archive can inform their research. This post is from Helen Taylor of Royal Holloway:

I am currently writing a doctoral thesis entitled ‘Adrian Henri and Merseybeat poetry: performance, poetry, and public in the Liverpool Scene of the 1960s’ (at Royal Holloway, with Professor Robert Hampson). My work uses much archival research and oral memory, particularly in relation to the live event and oral poetry in Liverpool at the time.

Sentiment analysis of the Domain Dark Archive would be useful in relation to my work on the Liverpool Poets and their reception by not only the mainstream media but also by those who experienced their work at the time (in the form of memoir, via fan pages, forums, and the like), and as such could provide me with another area of information to consider alongside newspapers, interviews, and archival material.

My main proposal for the AADDA is for a small, self-contained, project involving proximity search. I have found in my research that a variety of labels have been attached to the poets, and I think it would be most interesting to see how Adrian Henri, Roger McGough, and Brian Patten are referred to in forums and similar (informal) internet sites. Henri is often referred to in academic material as a poet/painter, but I want to find out how ordinary people, for want of a better word, labelled him – and I will then combine and compare this data with searches for the same terms from newspaper and published works, as there is a marked difference in academic and popular attitudes to the poets.
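As a rough sketch of what such a proximity search involves (the AADDA tools themselves may well work differently), the Python below reports which candidate labels occur within a given number of words of a poet's surname. The function name, the window size and the sample forum post are all invented for illustration.

```python
import re

def labels_near(text, name, labels, window=5):
    """Return the labels that occur within `window` words of `name`.

    name is a single word (e.g. a surname); labels are lowercase single words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    found = []
    for i, word in enumerate(words):
        if word != name.lower():
            continue
        # Words up to `window` positions before and after the name.
        nearby = words[max(0, i - window): i + window + 1]
        for label in labels:
            if label in nearby and label not in found:
                found.append(label)
    return found

# Invented example of an informal forum post.
post = "I saw Adrian Henri read in 1967; to us he was a painter first and a poet second."
wide = labels_near(post, "Henri", ["poet", "painter", "performer"], window=12)
narrow = labels_near(post, "Henri", ["poet", "painter", "performer"], window=5)
```

The choice of window matters: in this invented post a window of twelve words catches both 'painter' and 'poet', whereas a window of five catches neither.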

Subsequent to this, I would like to run geo-indexing analysis, to see where (as well as who and when) these results are coming from. I would expect results within Liverpool, but it would be interesting to see where else is recorded. It would be particularly interesting to see if the Liverpool 8 postcode (which is where the poets were living and working) would be an area of memorialisation.

This project could be important for my research because I am approaching the literary movement from a multi-, inter-, and cross- media perspective, to present Merseybeat poetry as ‘total art’. In the archives in Liverpool there are flyers for events with a variety of labels for the poets (many of which were written by the poets themselves for events and tours), but I want to be able to provide evidence for how the people experiencing the work have categorised the poets and I think that proximity search will help me prove my thesis.

Helen Taylor

Monday, 12 November 2012

London French Geo-Indexing and Image Tagging

This is the third in our series of guest posts from researchers with proposals for how the domain dark archive can be interrogated. Saskia Huc-Hepher of the University of Westminster writes:

Calculating the precise number of French people living in the capital and specifying where they live within the sprawling city has to this day never been achieved. The French Embassy itself admits to its ignorance in this respect, stating that there are approximately 120 000 individuals registered at the French Consulate in London, but that they estimate the true number of French Londoners to be somewhere between 300 000 and 400 000. I have devised several strategies to try to determine with more certainty an accurate figure, from scrutinising the number of French-native speakers in London's state schools (by borough) to examining the quantities of French citizens registering for UK National Insurance cards (by year), and my next tactic is to consult the electoral rolls of each London constituency, pending the publication of the 2011 census data (which now includes questions on identity and language). Whilst I am aware of the limitations of a geo-indexing study, that is, that it will not provide a 'hard' figure for the specified period, my hope is that targeted searches might serve to triangulate my current findings. My aim is therefore to use the geo-indexing tool to map out the areas of London with the greatest concentrations of French inhabitants on the basis of the post-codes associated with 'French' web sites / spaces. This data would have the potential to confirm either the unexpected findings of the Francophone-schoolchildren investigation mentioned above (unexpected in that the borough with the highest number of French speakers was Lambeth, not Kensington and Chelsea as the stereotype might suggest) or, on the contrary, reinforce the stereotype, as depicted in a map reproduced by the Think London (A. Wlores) report which identified Kensington & Chelsea, Westminster, Hammersmith & Fulham and Wandsworth as having the largest concentrations of French residents. It would also have historical value in that it would ascertain whether or not there was any relationship between the areas most associated with the London French today and the areas favoured in previous waves of migration to the capital. The findings could then be used in the multi-layered e-resources referred to in the context of the aforementioned AHRC bid.
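To make the post-code idea concrete, here is a minimal Python sketch of how postcode districts might be pulled out of page text and tallied. It is an assumption-laden illustration rather than a description of the geo-indexing tool itself: the regular expression is a simplified pattern that does not check whether a postcode really exists, and the sample pages and postcodes are invented.

```python
import re
from collections import Counter

# Simplified UK postcode pattern: outward code (e.g. SW7), optional space,
# inward code (e.g. 2AB). It does not validate against real postcodes.
POSTCODE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b")

def outward_code_counts(pages):
    """Count, for each postcode district, the number of pages mentioning it."""
    counts = Counter()
    for text in pages:
        # A set ensures each district is counted once per page.
        districts = {m.group(1) for m in POSTCODE.finditer(text.upper())}
        counts.update(districts)
    return counts

# Invented page texts standing in for 'French' web sites in the archive.
sample = [
    "Librairie francaise, 12 Example Street, London SW7 2AB",
    "Cours de francais: SW7 3CD and also SE11 4EF",
    "Contact us at sw7 2ab",
]
counts = outward_code_counts(sample)
```

Districts could then be grouped by borough and compared with the schools and National Insurance figures, bearing in mind that a postcode on a web page shows where an organisation is located rather than where people live.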

A study of this kind, focused on the French community in London, would be unprecedented and therefore make an entirely novel and original contribution to both academic and political spheres.

In addition to the 'physical' demographic mapping process I describe above, my doctoral research will also involve a multi-modal analysis of the French community websites selected for the Special Collection. Given the inherent and increasing multi-modality of the Internet, an ethnosemiotic approach to the examination of the London French web content would seem to be the most appropriate. My intention is to depict the visual landscape constructed by the French community websites and, using semiotic theory, attempt to infer meaning from the images and draw ethnographic conclusions regarding the community's sense of belonging; how they perceive and conceive London and its inhabitants; how they (re)present and define their own identity through images; what elements of France and Frenchness they portray and promote, etc. In order to give this visual study greater temporal contextualisation and depth, I intend to conduct a parallel micro-study on the Domain Dark Archive visual data using some kind of image-tagging analytical tool which would allow a word, or combination of words, such as 'French' and 'London', to be used to search only for photographs or images that have been uploaded onto the (London French) websites contained in the archive. This study could also serve to triangulate the findings of the geo-indexing investigation in that the images and spaces associated with key words such as 'London', or specific areas within London, may overlap with the places and spaces that were identified as being particularly French through the geo-indexing process and/or historically. This investigation would therefore have a dual objective: to provide visual data both for ethnosemiotic analysis and for triangulation of the geo-indexing data.

Further investigative mechanisms, comparable to the image-tagging search and analysis tool described above, could also be envisaged with the focus being on, by way of example, video or soundtracks. They were deemed, however, within the framework of the Domain Dark Archive, to be of reduced pertinence given that the earlier websites would undoubtedly contain less meaningful and more restricted data as a result of the technical constraints of the era. It is worthwhile considering such studies, nevertheless, for future scholarly research or AADDA pilots.

Saskia Huc-Hepher

Tuesday, 30 October 2012

PISA Rankings and public discourse

This is a guest post by another of the AADDA researchers, Gemma Moss:

PISA Rankings and public discourse: Using the web domain dataset to explore how comparative statistical data have been used to set an agenda for educational change in the UK

The Programme for International Student Assessment (PISA) is a way of comparing educational performance in different countries, by testing students at age 15 when they are preparing to leave schooling for work. Conducted at three-yearly intervals by the Organisation for Economic Co-operation and Development (OECD) since 2000, the latest round in 2012 involved 64 countries including all 34 OECD members. Since their inception the rank orderings of countries’ performance have acted as a major spur to educational reform in many jurisdictions, particularly countries which collect little performance data of their own. The findings are treated in national media as international league tables, with coverage in the UK focusing on our relative position (near to the mean) and whether we have risen or fallen in the rankings. This information often enters political discourse.

This project will use the potential of the web domain dataset to explore how reports of the first four cycles of assessment in the PISA series (2000; 2003; 2006; 2009) were covered on the net. In particular the research aims are to identify:
• the kinds of institutions that gave most prominence to the PISA findings,
• how the findings were interpreted, and
• the extent to which they led to calls for system reform.

In addition, this project will explore whether the analytic tools offered for analysing the web domain dataset enhance or hinder this form of enquiry.

The research questions are:
1. Can the analytic tools suggested for use with the web domain archive help establish:
• Which kinds of institutions were most likely to comment on PISA data? (Newspapers; government agencies; universities; think-tanks; individuals in the blogosphere)
• How the data were represented and interpreted?
• What the data led to in terms of ideas for system change in the UK?

2. Do the analytic tools employed to answer 1. offer efficiencies of research time and scale in understanding the uptake and recontextualisation of research knowledge about PISA via the web and the knowledge communities it represents?
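One simple starting point for the question about kinds of institutions is the domain suffix of each page that mentions PISA. The Python sketch below groups URLs by UK second-level domain; it is offered as an assumption about how such filtering might be done, not as a feature of the project's tools. The label names and example URLs are hypothetical, and suffixes alone cannot separate newspapers, think-tanks or individual bloggers, which would need a hand-built list of sites.

```python
from collections import Counter
from urllib.parse import urlparse

# UK second-level domains and the broad institution type each suggests.
# The labels are illustrative choices, not an official classification.
SUFFIX_LABELS = {
    ".gov.uk": "government",
    ".ac.uk": "academic",
    ".sch.uk": "school",
    ".org.uk": "organisation",
    ".co.uk": "commercial",
}

def classify(url):
    """Return a broad institution type for a URL, based on its host suffix."""
    host = urlparse(url).hostname or ""
    for suffix, label in SUFFIX_LABELS.items():
        if host.endswith(suffix):
            return label
    return "other"

# Hypothetical URLs of pages mentioning PISA.
urls = [
    "http://www.example.gov.uk/pisa-2009",
    "http://education.example.ac.uk/blog/pisa",
    "http://news.example.co.uk/education/league-tables",
    "http://www.example.com/pisa",
]
tally = Counter(classify(u) for u in urls)
```

Comparing such tallies across the four PISA cycles would give a first, coarse indication of which sectors gave the findings most prominence.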

Gemma Moss, Institute of Education

Thursday, 18 October 2012

The Decline of Parliamentary Political Engagement, 2004-2010: implications for 2012 and beyond

This is a guest post by Carole Taylor, one of the researchers investigating the Domain Dark Archive as part of the AADDA project:

I am investigating the decline of Parliamentary political engagement in the UK since 2004, a trend documented in the Hansard Society’s annual Audit[s] of Political Engagement. Public attitudes to the political process have “hardened” in recent years; for example the number of people certain that they will vote in a national election has dropped to an all-time low of 48%. My particular interest is in the impact of the work of MPs and peers in the Westminster Parliament, on public opinion; I want to be clearer about the links between political engagement and what Parliament does.

In my research proposal to this consultation, I suggested four questions that the Domain Dark Archive might address:

One: could we identify websites addressing some or all of the core indicators of political engagement (ie, knowledge and interest, action and participation, and efficacy and satisfaction)?
Two: could comparison searches be done to give parliamentarians an insight into changing public perceptions of the parliamentary process?
Three: can social media forums used by parliamentarians be identified in a time-sensitive way that highlights political themes commented on from one year to the next?
And four: could we examine the House of Lords blog, say, to analyse how politicians – peers in this case – engaged with the spontaneous, seldom thought-through but increasingly influential eruptions of public opinion expressed in tweets and blogs?

Given the limited amount of time we will have with the dataset this spring, I plan to focus on the last two questions, using the House of Lords as a case study not least because the Lords was the first parliamentary chamber in the world to set up a bipartisan blog (in 2008). Many peers comment on other blogs as well, and it will be interesting to chart how a discrete group of peers and public have interacted online during a period of decline in so-called political engagement. Between now and the spring I will interview peers with an interest in social media in order to identify why they got involved in blogging in the first place. This research will give me relevant key words and phrases to submit to the DDA consultation for search and analysis.

Dr Carole Taylor BSc, MA, PhD
taylorcm@parliament.uk

Friday, 25 May 2012

Workshop 3: for humanities and social scientists

Bookings for this event are now closed.

Could you imagine what research questions you might be able to answer using a comprehensive archive of UK websites for the period 1996 to 2010? If so, this workshop may be for you, and bookings are now open. It offers an introduction to the Domain Dark Archive, a unique new research dataset, purchased by the JISC from the Internet Archive, in the keeping of the British Library, and not yet publicly available.

The workshop affords a unique opportunity to learn about the DDA, and to help shape the development of the new user interface for the data. The results of these workshops will directly influence the development of the search, analysis and display tools for the new service.

Where: British Library (St Pancras, London)
When: Wednesday 13 June, 10.50am - 3.15pm

Sessions include:

Introducing the UK Web Archive at the BL, and the Domain Dark Archive in particular;
What is analytical access anyway, and what could it do for me?
Case study: How one scholar might use the Domain Dark Archive [see earlier post for a preview]

To book a place, contact the project manager, Dr Peter Webster, at Peter.Webster@sas.ac.uk, with a brief statement of your research interests.
Booking is free, but places are very limited.

The event will be most suitable for scholars at doctoral level or higher, but should not be viewed as introductory research training.
A sandwich lunch will be served, and we will also be able to reimburse reasonable travel expenses within the UK.

Workshop 2: arts and humanities

Bookings for this event are now closed.

Could you imagine what research questions you might be able to answer using a comprehensive archive of UK websites for the period 1996 to 2010? If so, this workshop may be for you, and bookings are now open. It offers an introduction to the Domain Dark Archive, a unique new research dataset, purchased by the JISC from the Internet Archive, in the keeping of the British Library, and not yet publicly available.

The workshop affords a unique opportunity to learn about the DDA, and to help shape the development of the new user interface for the data. The results of these workshops will directly influence the development of the search, analysis and display tools for the new service.

Where: British Library (St Pancras, London)
When: Tuesday 12 June, 10.50am - 3.00pm

Sessions include:

Introducing the UK Web Archive at the BL, and the Domain Dark Archive in particular;
What is analytical access anyway, and what could it do for me?
Case study: How one scholar might use the Domain Dark Archive [see earlier post for a preview]

To book a place, contact the project manager, Dr Peter Webster, at Peter.Webster@sas.ac.uk, with a brief statement of your research interests.
Booking is free, but places are very limited.

The event will be most suitable for scholars at doctoral level or higher, but should not be viewed as introductory research training.
A sandwich lunch will be served, and we will also be able to reimburse reasonable travel expenses within the UK.