tag:blogger.com,1999:blog-24756248956367454902024-02-07T22:34:57.142+00:00Analytical Access to the Domain Dark ArchiveDeveloping new forms of access to a dark archive of UK websites (1996-2010). Funded by the JISC, led by the Institute of Historical Research in partnership with the British Library and the University of Cambridge.Peter Websterhttp://www.blogger.com/profile/11658752319509408253noreply@blogger.comBlogger17125tag:blogger.com,1999:blog-2475624895636745490.post-38773113715325470882014-05-09T12:34:00.001+01:002014-05-09T12:34:55.867+01:00Researchers' final reports (4)In another of our series of researchers' final reports, I am posting a <a href="https://drive.google.com/file/d/0B1SQsiDbKNdoZ1dId0JJZ1RvcDg/edit?usp=sharing" target="_blank">link to a PDF</a> of a talk given by <a href="http://www.lshtm.ac.uk/aboutus/people/gorsky.martin" target="_blank">Martin Gorsky</a> of the London School of Hygiene and Tropical Medicine at the recent <a href="https://esshc.socialhistory.org/esshc-vienna-2014" target="_blank">European Social Science History Conference</a> in Vienna. Martin goes into plenty of detail here about how he used the search interface to the Dark Archive to research public health in local government in England.Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-35433653929229548312013-11-12T14:41:00.001+00:002013-11-13T13:43:46.543+00:00Researchers' final reports (3)This is the third in our series of final reports by the AADDA project researchers, posted with their permission. This one is by Dr Carole Taylor, a researcher at the House of Lords:<br />
<br />
<br />
<h3>
I.<span class="Apple-tab-span" style="white-space: pre;"> </span>Research Background and Methodology </h3>
<br />
My historical expertise lies in early Georgian music, art and politics and was not obviously suited to the Domain Dark Archive focus on UK websites extant between 1996 and 2010. However, my work as Research and Parliamentary Assistant to peers in the House of Lords seemed a more promising fit. I discussed this with colleagues in the Lords who immediately recognised the potential value of the web archive for MPs and Peers with “a range of policy interests which will map onto those of academic researchers”. With the particular encouragement and advice of Dr Elizabeth Hallam Smith (Director of Information Services and Librarian, House of Lords Library) I identified political engagement as an area of obvious interest to parliamentarians, as well as a theme noted by Peter Webster during the 13 June 2012 seminar as a category, among others, that lent itself well to web archive research of this kind, <a href="https://drive.google.com/file/d/0B1SQsiDbKNdodDdncm9mSjZHdGM/edit?usp=sharing" target="_blank">and wrote up a proposal</a>.<br />
<br />
I undertook an intensive period of research to familiarise myself with the present state of serious research on political engagement in the UK in order to identify a manageable research exercise to take to the AADDA interface. I was advised by several academic colleagues, particularly two PhD students in <a href="http://www.essex.ac.uk/government/">the Department of Government at the University of Essex</a>, one of the two main centres (together with the University of Lancaster) of political engagement studies in the UK. I was also assisted in this information gathering exercise by a Senior Researcher at the House of Lords Library where serious efforts are made to understand how parliamentarians are listening to and engaging with the public.<br />
<br />
In advance of our access to the AADDA I presented <a href="https://drive.google.com/file/d/0B1SQsiDbKNdoZUJObjVGeC1oSlk/edit?usp=sharing" target="_blank">a scaled-down version of my research proposal</a> to the IHR/BL team in March. I suggested a focus on social media forums used by parliamentarians, particularly <a href="http://lordsoftheblog.net/" target="_blank">the House of Lords blog</a>, launched in 2008. The House of Lords was the first parliamentary chamber in the world to set up a bipartisan blog, which makes it a compelling example in the history of political engagement. Disappointingly, in the ensuing months leading up to our encounter with the AADDA dataset, I learned that social media sites with the exception of .co.uk would <i>not</i> be included in the dataset, which meant my topic was no longer viable. I re-thought the proposal and decided on a very narrow, entirely new subject that felt manageable to complete within the parameters of the consultation – Heathrow’s Third Runway. In February 2013 I had a meeting with Jane Winters and Jonathan Blaney at the IHR to confirm this third version of the research proposal was acceptable (it was).<br />
<br />
Thus I was on track for the purposes of the consultation. However, with such limited access to social media sites the value of this exercise for serious researchers at Parliament was considerably eroded. Even the significance of the results below on the “Third Runway” was questioned, albeit sympathetically, by parliamentarians who cautioned that I appeared to be accessing information that is already well known to parliamentary researchers. Their interest is obviously in what this resource can offer over and above what they know already. It may be that areas that did not receive such widespread public airing as the Third Runway did will deliver better results.<br />
<br />
<h3>
<br />II.<span class="Apple-tab-span" style="white-space: pre;"> </span>Research Results: </h3>
<br />
1st session: March 2013<br />
<br />
I questioned the interface three times.<br />
<br />
<ol>
<li>“third runway” – 171 items; </li>
<li>“third runway” AND “parliament” – 71 items; </li>
<li>“third runway” AND “heathrow” AND “parliament” – 69 items</li>
</ol>
<br />
<br />
Yield:<br />
<br />
<ul>
<li>a lot of travel companies; </li>
<li>.gov.uk (5 items) – entirely predictable; </li>
<li>public suffixes are important for investigating engagement, but I didn’t readily grasp how to link the left and right sides of the search results page usefully</li>
</ul>
<br />
<br />
<b>Questions arising:</b><br />
How many of the 100 items that were dropped between the first two searches might have included useful information? In this respect, I agree with GM at the 21 March 2013 meeting who said “there needs to be a ‘search within’ option, for when there are many thousands of results.” PW’s response that “in such cases adding more search terms should have the same effect” is helpful to reduce “many thousands” to a couple of hundred; however, at this point I might not want to lose potentially useful information in the course of adding a new search term.<br />
<br />
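A note on the arithmetic here: a Boolean AND search can only narrow a result set, so the items “dropped” between two searches are the set difference between them, and a “search within” option amounts to running the narrower search over the earlier results. The following is a minimal sketch in Python, assuming a tiny made-up index; the <code>docs</code> data and the <code>search</code> helper are hypothetical and say nothing about how the AADDA interface is actually implemented.<br />
<br />

```python
# Hypothetical miniature index: document id -> text. This is NOT AADDA data.
docs = {
    1: "Heathrow third runway debated in Parliament today",
    2: "Cheap flights: third runway will boost holiday travel",
    3: "Parliament votes on rail fares",
    4: "Residents oppose third runway at Heathrow",
}


def search(index, *phrases):
    """Return the ids of documents containing every phrase (Boolean AND)."""
    return {
        doc_id
        for doc_id, text in index.items()
        if all(phrase.lower() in text.lower() for phrase in phrases)
    }


broad = search(docs, "third runway")
narrow = search(docs, "third runway", "parliament")

# "Search within": apply the extra term only to the earlier result set.
within = search({i: docs[i] for i in broad}, "parliament")

# Items lost by adding the second term; these are the ones a researcher
# might still want to inspect before discarding them.
dropped = broad - narrow
```

In this sketch <code>within</code> and <code>narrow</code> are the same set, which is the sense in which adding a search term has the same effect as searching within results; keeping <code>dropped</code> to hand is what the interface did not offer.<br />
<br />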
What about people who are undecided or don’t express their views? This is an obvious but important qualitative question for historical researchers.<br />
<br />
<b>Suggestions:</b><br />
It would be a great help if a preview screen were available to the right of each item. Through all my searches (in March and September), I clicked on countless items that were duplicates of what I’d clicked on two or three items earlier. (Titles of items often differ, so titles alone are not a dependable indicator.)<br />
<br />
At the March meeting I asked Peter and Andrew how to turn the search data into ngrams; the answer was that AADDA will have a <b>“click to create ngram” function</b>. This is not there yet, but it would be a great help.<br />
<br />
2nd session: September 2013<br />
<br />
I questioned the interface 10 or 15 times.<br />
<br />
<ol>
<li>“third runway” AND “parliament” – yielded 990 items, but breakdown of these results (crawl year, content type etc) proved both manageable and useful; </li>
<li>“third runway” AND “soley” – 122 items: Lord Soley was Chairman of “Future Heathrow”, the pro-expansion group; among the 122 items was an interview (helpful, though it appeared twice); the first 21 items were all the same and most were inaccessible or gobbledygook (cooking recipes); several items, e.g. travel sites, had no mention of Soley or the third runway at all; </li>
<li>“third runway” AND “house of lords” – 206 items; and “third runway” AND “aviation” – 2000+. For both of these, I checked the two extreme ends of the sentiment analysis (“very positive” and “very negative”). Many of these items failed to link the two search filters in any way. Nearly all of the 206 items in the first set were from the BBC – this was not only a problem of repetition (though there was plenty of this), but these are also widely public documents of little use to parliamentarians (who are already well-equipped with knowledge at this level).</li>
<li>I checked “third runway” AND “Howard Davies” on the off chance he was mentioned in this connection before he became Chairman of the Airports Commission in 2012 – eight items, all identical (a pdf report of the Association of British Insurers that had no mention of “third runway” or Davies) – disappointing! </li>
<li>Also checked “third runway” AND “future of aviation”; “third runway” AND “environment”; “third runway” AND “economy” – no new observations.</li>
</ol>
<br />
<div>
<br /></div>
<b>Suggestions:</b><br />
It would be a great help if we could print the page with search results (or somehow export this material).<br />
<br />
<b>Questions and Concerns:</b><br />
Clearly, in this September round of questioning the dataset, I was encountering problems with the <b>Boolean AND search</b> that didn’t arise in March. At best I seemed in September to be accessing OR rather than AND; at worst there was no connection to either search filter. I corresponded with Richard Deswarte about this, but neither he nor I could see where the problem lay.<br />
<br />
<b>Sentiment Analysis</b>, where it hits items to do with the search term(s), was at least consistent and might therefore be of interest in early stages of research.<br />
<br />
<b>Repetition</b>: This is my biggest concern about keyword searching: does the repetition of material from one crawl to the next render the totals listed on the search results meaningless for the historian? And will this problem be multiplied by 200 when the entire dataset is available? Peter cautioned users to avoid taking numbers of results for 2009 and 2010 as evidence of patterns in relation to the previous years; does repetition compound this problem for all years?<br />
<br />
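The repetition concern can be made concrete with a small worked example. The sketch below, in Python, assumes a made-up list of captures and counts results per crawl year twice: once as raw hits, and once counting each distinct page text only in the year it first appears. The <code>captures</code> data and the use of a SHA-1 digest of the page text as a duplicate key are illustrative assumptions, not a description of how the AADDA dataset or interface handles duplicates.<br />
<br />

```python
import hashlib
from collections import Counter

# Hypothetical captures: (crawl year, URL, page text). This is NOT AADDA data.
captures = [
    ("2008", "http://example.org/a", "Third runway approved"),
    ("2009", "http://example.org/a", "Third runway approved"),
    ("2009", "http://example.org/b", "Third runway protest"),
    ("2010", "http://example.org/a", "Third runway approved"),
]


def digest(text):
    """Fingerprint the page text so identical captures share one key."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Raw totals: every capture counts, as on a search results page.
raw_counts = Counter(year for year, _url, _text in captures)

# Distinct totals: each unique text counts once, in its earliest crawl year.
seen = set()
distinct_counts = Counter()
for year, _url, text in sorted(captures):
    key = digest(text)
    if key not in seen:
        seen.add(key)
        distinct_counts[year] += 1
```

Here four raw hits reduce to two distinct documents, and the 2010 hit disappears altogether because it repeats a 2008 capture. On this toy data, at least, year-on-year totals say more about re-crawling than about new material.<br />
<br />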
<br />
<h3>
III.<span class="Apple-tab-span" style="white-space: pre;"> </span>Concluding Remarks</h3>
<br />
The Digital History and Archives seminar presented by Peter Webster and Richard Deswarte at the IHR on 23 September 2013 was an invaluable guide to my second round of searches on the interface: <a href="http://historyspot.org.uk/podcasts/digital-history/web-archives-new-class-pr">http://historyspot.org.uk/podcasts/digital-history/web-archives-new-class-pr</a> - click on “Web Archives: A New Class of Primary Source for Historians?” I’d particularly like to highlight Peter’s observation that the traditional separation of historian and keeper of archives no longer holds in digitized systems of this kind. During the Q&A Tim Hitchcock expanded on this point, remarking that models of society now being digitized – newspapers, etc – were of course not digitized at the time. These changes demand a new skillset now being shaped by and for C21 historians. To this I would add that scholars will have questions about subjects they know well and subjects they are addressing for the first time, and this fact needs also to be built into the process of curating datasets of this kind – particularly in the present, pioneering stage of digital research.<br />
<br />Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-22394950421758372342013-10-23T16:57:00.000+01:002013-10-23T16:57:00.413+01:00Researchers' final reports (2)This is the second in our series of final reports by the AADDA project researchers, posted with their permission. This one is by Saskia Huc-Hepher:<br />
<br />
<br />
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<b><span style="font-family: "Arial","sans-serif"; font-size: 14.0pt; mso-bidi-font-family: Tahoma;">AADDA Testing Report: <o:p></o:p></span></b></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<b><span style="font-family: "Arial","sans-serif"; font-size: 14.0pt; mso-bidi-font-family: Tahoma;">The French Community in London<o:p></o:p></span></b></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<b><span style="font-family: "Arial","sans-serif"; font-size: 14.0pt; mso-bidi-font-family: Tahoma;">by Saskia Huc-Hepher<o:p></o:p></span></b></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<table border="0" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="border-collapse: collapse; margin-left: 2.75pt; mso-padding-alt: 2.75pt 2.75pt 2.75pt 2.75pt; mso-table-layout-alt: fixed;">
<tbody>
<tr>
<td style="border: solid black 1.0pt; mso-border-alt: solid black .25pt; padding: 2.75pt 2.75pt 2.75pt 2.75pt; width: 482.0pt;" valign="top" width="643">
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; font-size: 14.0pt; mso-bidi-font-family: Tahoma;">1 -
Methodology<o:p></o:p></span></b></div>
</td>
</tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">The initial purpose of this research was two-fold: firstly,
to use the geo-indexing tool to map out the areas of London with the greatest
concentrations of French inhabitants on the basis of the post-codes associated
with 'French' Web sites / spaces; and, secondly, to identify </span><span style="font-family: Arial, sans-serif;">French community websites in the
Domain Dark Archive (DDA) appropriate for subsequent multimodal analysis on the
basis of their visual and textual meaning potentialities. The ultimate objective
of the former was to triangulate the
findings of additional empirical research conducted within the framework of my
PhD, which sought to ascertain the actual numbers and hot-spots of the London
French community, thereby serving to dispel the exclusively, or at least
predominantly, South Kensington myth. The aim of the latter was to
scrutinise the visual landscape of the London French over the period of the DDA
data set, as (re)presented through the images – still or moving, in parallel to
the technological advances of the Internet – displayed on the French community
websites found in the DDA. It was envisaged that this historical visual data
would provide the study with greater temporal contextualisation and depth, and,
using social semiotic theory, in particular multimodality, would allow meaning
to be inferred and ethnographic conclusions drawn from the images, on such
subjects as the community's sense of belonging; how they perceive and conceive
London and its inhabitants; how they (re)present and define their own identity
through images; what elements of France and Frenchness they portray and
promote; and whether any of these have changed over time.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">Similarly, it was
hoped that the geo-indexing analysis would be of historical value, determining
whether or not there was any relationship between the areas most associated
with the London French today and those districts favoured in previous waves of
migration to the capital. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">The final objective
of the DDA research proposed here was for the image-tagging analytical tool to
enable a word, or combination of words, such as 'French' and 'London', to
search for photographs or images only, the visual data thereby potentially
serving to triangulate the findings of the geo-indexing investigation in that
the images and spaces associated with key words such as 'London', or specific
areas within London, could have coincided with the places and spaces that were
identified as being particularly French through the geo-indexing process and/or
historically. This micro-investigation was therefore to be binary in its
objectives: visual data for ethnosemiotic analysis and geo-indexing data for
triangulation of previous qualitative research.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">The methodology
outlined above was adopted on several occasions over the course of the AADDA
project time-span: firstly in March 2013, later in August 2013 and September
2013, with a final trial, using the most functional interface and comprehensive
data set, in October 2013. The results, at every stage, however, were
disappointing. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<table border="0" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="border-collapse: collapse; margin-left: 2.75pt; mso-padding-alt: 2.75pt 2.75pt 2.75pt 2.75pt; mso-table-layout-alt: fixed;">
<tbody>
<tr>
<td style="border: solid black 1.0pt; mso-border-alt: solid black .25pt; padding: 2.75pt 2.75pt 2.75pt 2.75pt; width: 482.0pt;" valign="top" width="643">
<div class="MsoNormal">
<b><span style="font-family: Arial, sans-serif; font-size: 14pt;">2 – Deep Search Data Testing <o:p></o:p></span></b></div>
</td>
</tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><u><span style="font-family: Arial, sans-serif;">March 2013<o:p></o:p></span></u></b></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">The first trial session was carried out in the knowledge
that at that point in time the DDA included only a random subset of the entire
cohort of data, but one which was evenly spread over the archive in temporal
terms. Therefore, in theory, trends, developments and patterns should have been
identifiable, despite sentiment analysis and geographic options not being
available at that stage. In practice, however, a number of basic search hurdles
prevented any valuable findings from materialising. These included:<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<ul style="margin-top: 0cm;" type="disc">
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">the
lack of clarity regarding the need to click on the crawl date to access a
website; choosing the website title would have been more intuitive. Such
functionality was updated at the subsequent meeting (21/03/2013);<o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">the
lack of clarity regarding the purpose of the bar charts at the top of the
page; they have since been removed;<o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">the
fact that not all web captures functioned at that time – e.g. <i>Le Petit
Parisien </i>restaurant had no images and almost no text (but enabled me
to do a current Google search for the website, only to find out that the
restaurant – and its website – are now closed; this is therefore an example of
the potential historical worth of the DDA, had it been operating
correctly, in allowing the analysis of obsolete Websites); <o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">some
websites cited in the list of 'hits' subsequently being found to be
unavailable; the links to alternative sites proved to be useful, however;<o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">time
being wasted revisiting Websites which had already been scrutinised. Once a
site has been viewed, it would be helpful and more time-efficient if the
visited link appeared in a different colour (e.g. purple, cf. Google) from
the others on the list;<o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">the
fact that search tools operated extremely slowly and the interface was not
yet user-friendly. Speeds and appearance have since improved and the
latter is no doubt a work in progress;<o:p></o:p></span></li>
<li class="MsoNormal"><span style="color: windowtext;"><a href="http://web.archive.org/web/20080601000000*/http:/www.guardian.co.uk/world/2008/jul/12/france.islam"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">http://web.archive.org/web/20080601000000*/http://www.guardian.co.uk/world/2008/jul/12/france.islam</span></a></span><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">
Here, every separate date in the July (burka scandal) peak (as well as all
the other dates in August and October 2008, the two snapshots available
from 2009 and the single one from 2012) showed the <i>same</i> snapshot
from <i>The Guardian </i>(12 July 2008). If the online material is
unchanged in relation to another date, this should be immediately visible
on the list of data (possibly via colour coding, as suggested for the
pre-visited Web pages, or grouping by content & date);<o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">the
majority of search results not being particularly useful for my purposes;
they were either not relevant (for instance displaying large numbers of
Websites related to French tourism for English users) or not
French-specific (that is, 'Londres' retrieved results in Portuguese,
Spanish, etc., not French exclusively; while English search words
retrieved sites aimed at Francophiles as opposed to Francophones);<o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">phrase
searching using the “double inverted commas” being equally disappointing
(nothing of relevance was found following a search for “French community
London”, or indeed '“French” and “community”', trialled at a later stage);
“French London” was therefore tested, resulting in a list of sites
relating to French teachers & jobs in London.<o:p></o:p></span></li>
</ul>
<div class="MsoNormal" style="margin-left: 18.0pt;">
<br /></div>
<div class="MsoNormal" style="margin-left: 18.0pt;">
<span style="font-family: Arial, sans-serif;">Conversely, it was useful to have the
'media' / 'pdf' search options at the bottom of the screen, as this enabled
access to images and audio 'texts' (of relevance to the multimodal methodological
/ theoretical approach taken in my research).<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">Overall, the initial testing was found to be useful in
assessing the lasting impact, or otherwise, of the French community on London,
in a temporally comparative manner. That is, by identifying French restaurants/cafés/businesses
through their retrospective on-line presence before submitting the titles to a
live Google search at the time of testing, I was able to discover if such
enterprises were growing, in decline or defunct. Whilst that limited use was of
potential value to my research in assessing the lasting contribution of French
businesses to London's cultural and economic landscape, I was nevertheless
acutely aware (given my curation of the London French Special Collection for
the UK Web Archive) of the mass of relevant data – such as community websites
and blogs – which had not been detected or listed as featuring in the DDA. It
was hoped at the time that this was due to the incomplete and arbitrary state
of the data set.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><u><span style="font-family: Arial, sans-serif;">August 2013<o:p></o:p></span></u></b></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">This trial was more successful than the last as regards the
speed and efficiency of the data search tools, despite there still being only a
five per cent random, if temporally representative, sample of websites
available. Somewhat paradoxically, those searches which pinpointed the early
years of Internet use, namely 1996 and 1997, proved to be the most valuable.
Several different searches were tested on this occasion, as follows:<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: Arial, sans-serif;">a)</span></b><span style="font-family: Arial, sans-serif;"> A search for the terms “French
community” was filtered by language, using the “French” option. This
functionality was found to be extremely useful in reducing the large amount of
irrelevant data to a more manageable subset. Again, by filtering further, this
time by year (in this case 1996 and 1997), I was able to focus in on yet more pertinent
Web pages. Thus, when I began to analyse the &lt;Associations Françaises&gt;
site, I noted that the landing page directed the visitor to separate sites, one
for French expatriates and one for Belgians. Not only are these sites an
indication of the relative establishment of the said Francophone communities in
the UK, each warranting an on-line home for the long list of associations set
up in the country of residence, but the fact that a distinction is made between
Belgian and Franco-French populations has implications regarding identity.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">Using the same search terms, another site, &lt;Les
Grenouilles Cablées&gt;, harvested in 1996, proved worthy of an initial
analysis. Firstly, the landing page pointed the visitor in the direction of
three separate sub-sections: &lt;Grenouilles du monde&gt;, &lt;Grenouilles des
USA&gt; and &lt;Grenouilles de Californie&gt;. These distinctions suggest that
either the French expatriate community was more significant in the USA than
elsewhere at that time (including London; this is no longer the case, a change
perhaps related to the opening of European borders) or that US residents,
including French ones, were earlier adopters of Internet technology than those in the
UK. When examining the site more closely and entering the
&lt;Grenouilles du monde&gt; space, it was telling that the
first choice was then &lt;Nouvelles de France&gt; (before the hyperlink to
Quebec), which suggests that this website is indeed aimed at the French expat
diaspora worldwide, linked together by their shared affinity to France, and
keen to maintain links with the homeland. Further, when choosing the French
news link, the selection of newspapers available was a left-leaning one. Again,
the possible implications of this are two-fold: either the political leanings
of the newspapers featured are an indication of the papers' social commitment,
i.e. making information freely available to all, or they are an indication of
the profile of the diaspora visiting on-line sites at that time, i.e. <i>Libération</i>
and <i>Charlie Hebdo</i> both target a young, left-wing readership. If this is
the case, it is thus a profile at odds with the predominantly right-wing
(particularly at that time) expat community of the South Kensington stereotype,
which serves to substantiate the hypothesis posited at the beginning of this
report. There are also hyperlinks to &lt;Météo France&gt; (suggestive of a need
for a physical sense of proximity to the homeland, despite the geographical
distance separating the community from it) and to &lt;Les dernières nouvelles
d'Alsace&gt; and &lt;Pariscope&gt;, both of which could be indicative of a longing
for insignificant local minutiae in the globalised age, made possible through
the worldwide Web, as well as pointing towards greater emigration from eastern
France (and Belgium, as confirmed by the first website) and the French capital
than other geographical zones.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">This site offers links to French audiovisual sites
including radio and TV and, perhaps more importantly for my research, to two
on-line fora, &lt;French Talk&gt; and &lt;Francopolis&gt;, which are evidence of
the formation of both Internet and French <i>communities </i>(despite other
empirical evidence suggesting that the French community per se does not exist,
or if at all, in South Kensington alone). Finally, this website creator's
recommended sites are telling in terms of identity (just as a Blog would be
today in its related networks) especially within the theoretical framework of
Pierre Bourdieu's Habitus, with the Vatican, <i>Charlie Hebdo</i>, the RATP (equivalent to TFL in London) and various
French sports sites (football, Formula 1 and rugby) featuring among others. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">Another site displayed following this search was the
&lt;Association des Francophones de Cranfield&gt;, in which advice is provided
on low-cost means of transport to France and Belgium. This in itself
demonstrates that the target audience are medium- to long-term French residents
of the UK, rather than short-term visitors, and that they have been attracted
to England by its (Higher) education system – a point which, as incongruous as
it may appear, is corroborated by the qualitative data gathered outside the AADDA
project. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: Arial, sans-serif;">b) </span></b><span style="font-family: Arial, sans-serif;">The second search undertaken in the
August trial was “London French” by “content type”, notably “image”. This was
highly disappointing and of little use given that the few images which were
displayed related to French football or simply contained a set of codes, with
no discernible image. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: Arial, sans-serif;">c) </span></b><span style="font-family: Arial, sans-serif;">To counter the insufficiency of the
image search above, a “format search” was instead chosen from the AADDA
homepage. This was more successful in terms of number, with some 6,369 items
listed for the “French London + format” search trialled, filtered by year
(2006). However, given that the images were not tagged and stood in complete
isolation, their usefulness was questionable, as many appeared to relate not to
the French community in London but to websites on French property or to university
Web pages.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: Arial, sans-serif;">d)</span></b><span style="font-family: Arial, sans-serif;"> This search attempted to assess the
value of the post-code filter, which initially was again rather disappointing.
Given the lack of pertinence of the majority of the sites identified after the
early years (1996, 1997), their related post-codes were of equal irrelevance.
Furthermore, there were no apparent clusters of London websites, with many
coming from outside London</span><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">; no
micro-geographical/demographic conclusions could therefore be drawn. A
subsequent search (“French community” filtered by language and year), despite
listing only one Website, revealed two potentially telling post-codes, N7 and
NW5, for 2010, which could have been related to the forthcoming opening of a
new French State school in Kentish Town (NW5) (but the insignificant numbers
involved are again inconclusive). <o:p></o:p></span></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;"> <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">e)</span></b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;"> A search for “communauté française”, filtered by year (2001) and
language (French) identified a Blog, which would have been of particular
pertinence to my research. However, it transpired that the said Blog was the
work of an English-speaker, practising their written French, rather than a
French Londoner's Blog. The lack of Blogs retrieved by the DDA search engine
was perplexing, as many are known to me within the framework of my UK Web
Archive Special Collection work. The question of whether this is due to the
domains favoured by the London French Bloggers as hosts for their
autobiographical logs is therefore worth consideration, and if so, the
possibility of accessing them through the DDA should also be contemplated. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">f) </span></b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">The same search as in item (e), this time written in and filtered by
the English language for the year 2010, found only one Website, the <Ile aux
enfants> school in North London. Despite the unexpected limitedness of the
search results in this case, the “links to host” tool was telling, particularly
in terms of “mapping the field” and Bourdieu's “three-stage analysis” paradigm.
That is, by scrutinising the – predominantly institutional – list of Websites
linked to the <Ile aux enfants>, such as <ambafrance>,
<assemblee-afe>, <bienvenuealondres> and <edufrance>,
socio-cultural assessments were facilitated. Nevertheless, it was frustrating
that these links to the host site were not functioning during the trial,
directing the visitor back to the host page as opposed to opening the linked
Webpage itself. It was not clear, therefore, whether their inclusion was
exclusively for quantitative analysis (the number of visits was in brackets),
as they were of no qualitative worth without access to the content of the
linked Websites. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">September 2013<o:p></o:p></span></b></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">The most notable and satisfying difference between this trial and the
preceding ones was that all the links to related Websites were at least
partially, and in the great majority of cases completely, successful. This
meant that the discovery of one website (from a long list of still relatively
futile others), namely the “Londoscope” reference pages of the
<www.acticours.freeserve.co.uk> proved to be invaluable through its
hyperlinks, as opposed to the content of the site itself. Thus, several
pertinent results were attained, as detailed below:<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">a)</span></b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;"> The appearance of London French social-networking-type pages, known as
<Londoscope>, is perhaps indicative of the growing numbers of French
Londoners seeking a physical sense of community by means of digital linking and
dissemination mechanisms. Entries such
as “Eglise protestante française de Londres: Soirée anti-stress” and the enumeration
of French films on show at the Ciné Lumière and the NFT, together with other
French cultural events at the Institute of Contemporary Arts, bear witness to
the importance of French culture to London's overall cultural capital and are
also evidence of community belonging in
practice.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">b)</span></b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;"> The <Londoscope> pages from 2003 enabled the identification of a
culturally and historically pertinent French amateur dramatics group which has
been performing in London since 1929: Le Cercle dramatique français (CDF). My
research into this amateur theatre company can now be taken forward in an
effort to ascertain whether it is still in existence and, if so, its place in
French community life today.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">c)</span></b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;"> Another link on the same Website, from 2004, referred to the Francophone
television channel TV5 celebrating its 20<sup>th</sup> anniversary and revealed
some useful viewer figures, including it being watched in 167 million
households in 2003, with some 56 million weekly viewers. This constitutes
further evidence as to the impact of the French language and culture worldwide
and potentially to the growing French diaspora. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">d)</span></b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;"> The final finding of relevance during this trial session was the
<Londoscope> link to the ADFE (Association Démocratique des Français à
l'Etranger), created in 1980 'par des Français qui voulaient, pour les
représenter, une association dynamique et correspondant aux nouvelles réalités
de l'expatriation' (i.e. by French people who sought representation through a
dynamic association in tune with the new realities of expatriation). This
quotation alone is of worth for a number of reasons: firstly, the notion of
'representation' itself is key, as it raises the question of 'representation <i>to
whom</i>?', which, reading further, it appears is to the French authorities.
This in turn indicates that the need to be politically represented in France
has its roots much further back historically than the election in 2012 of the
first ever </span><span style="font-family: "Arial","sans-serif";">Député </span><span lang="EN" style="font-family: "Arial","sans-serif"; mso-ansi-language: EN;">for
French overseas residents implies</span><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">, as well as demonstrating an unwillingness to
integrate fully in the London socio-political scene and an attachment to the
homeland. Similarly, the notion of “new realities” suggests a shift from an old
form of migration to a new one, acting as a temporal forerunner to the massive
wave of cross-Channel immigration which began in the early nineties and
continues to this day. The term “dynamique” could also be seen to illustrate
the London “pull factor” for French expats living in the capital; that is, many
are arguably escaping the inertia and complacency of French institutions and
mindsets in their decision to emigrate to London, as exemplified in other forms
of empirical evidence gathered for this research. Here, therefore, the data
gathered from a single Website in the DDA has served to triangulate several key
findings in my PhD. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">October 2013<o:p></o:p></span></b></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">Having exhausted most of the available search options during the
previous trial sessions, this was the shortest and least enlightening of all.
It was necessary, nonetheless, to conduct a final test
with</span><span style="font-family: Arial, sans-serif;"> the most functional
interface to date and a now complete data set. The search tools also provided
an opportunity for sentiment analysis, unavailable in previous trials.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">This experiment
involved a phrase search for “London French community” combined with “English language”
and “very negative” sentiment filters. No results were identified. When the
French language was used and chosen as a filter, 240 matches were found, but
these were of little relevance to my research given their pedagogical focus.
One potentially valuable find for historians of the French presence in the UK
was a Website on Augustine monks, in which the flight of monks from France
during and after the French Revolution, and the creation of brotherhoods in
York (1802), Bristol (1818) and Ealing (1897), where a Benedictine monastery
was founded, were reported. However, in view of the contemporary emphasis of my
research, this proved of little relevance, once again. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">Further searches,
using different phrases/words, content types and language/sentiment filters
were also trialled, to no avail. Furthermore, it was disappointing to note that
the post-code and media filters appeared to have been removed, or were not
readily visible. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif;">Overall, if not the
least successful of the trials conducted to date, this was the most frustrating,
given the unfulfilled aspirations of working with the complete data set.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<table border="0" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="border-collapse: collapse; margin-left: 2.75pt; mso-padding-alt: 2.75pt 2.75pt 2.75pt 2.75pt; mso-table-layout-alt: fixed;">
<tbody>
<tr>
<td style="border: solid black 1.0pt; mso-border-alt: solid black .25pt; padding: 2.75pt 2.75pt 2.75pt 2.75pt; width: 482.0pt;" valign="top" width="643">
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; font-size: 14.0pt; mso-bidi-font-family: Tahoma;">3 -
Lessons Learnt<o:p></o:p></span></b></div>
</td>
</tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">The lessons learnt from this exercise are as follows:<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<ul style="margin-top: 0cm;" type="disc">
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">“Think
small” – minimising one’s research objectives is perhaps the only way of
navigating the enormity of the data.<o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">Maximise
material – as the deep search process is akin to searching for the
proverbial needle in a haystack, any data identified as
pertinent should be analysed immediately, or saved for subsequent analysis,
due to the apparent randomness of the retrieval process. <o:p></o:p></span></li>
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">Use
big data for its quantitative value, but not for drawing representative
conclusions or in an attempt to test large-scale hypotheses, due to the
apparent fallibility of the findings. Therefore, restrict qualitative
research to the micro-findings of those Web sites and Web pages found to
be of value – albeit somewhat arbitrarily – and optimise this data for its
comparative and preservation worth. <o:p></o:p></span></li>
</ul>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<table border="0" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="border-collapse: collapse; margin-left: 2.75pt; mso-padding-alt: 2.75pt 2.75pt 2.75pt 2.75pt; mso-table-layout-alt: fixed;">
<tbody>
<tr>
<td style="border: solid black 1.0pt; mso-border-alt: solid black .25pt; padding: 2.75pt 2.75pt 2.75pt 2.75pt; width: 482.0pt;" valign="top" width="643">
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; font-size: 14.0pt; mso-bidi-font-family: Tahoma;">4 -
Future research and AADDA Recommendations<o:p></o:p></span></b></div>
</td>
</tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">As regards my own research, I intend to explore the identity / Habitus
evidence found in early Websites (1996 / 1997) in greater detail and compare it
with contemporary Blogs to establish whether the same affiliations are present
and the same sense of group (or other) identity. These findings will also
be compared and triangulated with the qualitative data gathered from one-to-one
interviews with members of the contemporary French population in London. It is
also possible that I will study sample historical Websites / Webpages alongside
their contemporary equivalents, from a multimodal perspective, to gain an
understanding of how technological constraints might influence the making of
meaning to varying degrees over time.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">It is unlikely that the post-code filter searches will be used to
inform my research, given the weakness of the findings, but the process was
worthwhile in its disproving of my theory, and some cautious, small-scale
conclusions could be drawn from the associations with the NW5 district. <o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">With respect to the AADDA project looking forward, the following
recommendations have been tentatively made:<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<ul style="margin-top: 0cm;" type="disc">
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">(colour?)
coding to indicate both sites already visited and replica Webpages
(identified repeatedly according to the sweep date)<o:p></o:p></span></li>
</ul>
<div class="MsoNormal">
<br /></div>
<ul style="margin-top: 0cm;" type="disc">
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">inclusion
of a Blog filter (in addition to the <.org.uk>, <.ac.uk>, etc.
domain filters) <o:p></o:p></span></li>
</ul>
<div class="MsoNormal">
<br /></div>
<ul style="margin-top: 0cm;" type="disc">
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">“links
to host” tool to open link in new window<o:p></o:p></span></li>
</ul>
<div class="MsoNormal">
<br /></div>
<ul style="margin-top: 0cm;" type="disc">
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">retention
of the post-code and media type (image, audio, video, etc.) filters, with
tags / provenance if possible<o:p></o:p></span></li>
</ul>
<div class="MsoNormal">
<br /></div>
<ul style="margin-top: 0cm;" type="disc">
<li class="MsoNormal"><span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">user-friendly
search “help” / search “tutorial” function as cursor hovers over certain
fields (such as host links, domains, numbers, etc.) on the deep search
landing page (giving particular advice regarding correct wording and
punctuation, for example)<o:p></o:p></span></li>
</ul>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<table border="0" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="border-collapse: collapse; margin-left: 2.75pt; mso-padding-alt: 2.75pt 2.75pt 2.75pt 2.75pt; mso-table-layout-alt: fixed;">
<tbody>
<tr>
<td style="border: solid black 1.0pt; mso-border-alt: solid black .25pt; padding: 2.75pt 2.75pt 2.75pt 2.75pt; width: 482.0pt;" valign="top" width="643">
<div class="MsoNormal">
<b><span style="font-family: "Arial","sans-serif"; font-size: 14.0pt; mso-bidi-font-family: Tahoma;">5 -
Conclusion<o:p></o:p></span></b></div>
</td>
</tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; mso-bidi-font-family: Tahoma;">The lasting impression, having carried out several trial sessions using
the DDA data and its current search tools, is that the results can present
islands of valuable resources within a sea of irrelevant material, but that the
likelihood of finding them is dictated by chance rather than design. Throughout
this testing process, I have pondered the reason for the seemingly arbitrary
nature of my AADDA findings and for my failure to access a greater amount of
material relevant to my research; that is, the question of whether my lack of
technological expertise was the cause or whether such outcomes are inherent to
searching this vast set of data has been recurrent and remains unanswered.
Instructions offering clear guidelines on the best ways to use the archive and
acknowledging its limitations would therefore be both helpful and reassuring to
researchers.</span></div>
<div class="MsoNormal">
<br /></div>
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-50399273115521741342013-10-16T16:46:00.002+01:002013-10-16T16:53:33.269+01:00Researchers' final reports (1)Our project researchers on AADDA have kindly written up the research they planned to do with the web archive, with a summary of how it went and the problems that they encountered. I'll be posting these as blog posts over the next few months. Here is the first, from Helen Taylor:<br />
<br />
<br />
<div class="MsoNormal" style="text-align: justify;">
<b><u>AADDA Report: Sentiment Analysis and the Reception of the Liverpool
Poets<o:p></o:p></u></b></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
<u>My project and the AADDA: a
lesson in ‘digging down’<o:p></o:p></u></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
When I proposed my research
project for the Analytical Access to the Domain Dark Archive project, it was
based on a ‘wish list’ of tools that scholars might want to use to access this
resource. The tools my proposed project required were sentiment analysis,
proximity search, and geo-indexing. This latter was not available during this
test period, but the first two were. However, this report is less a
record of my findings than a caution against making assumptions about the data produced
via these two tools. </div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
I sought to access information
about the reception of the Liverpool Poets (in practice, I focused solely on
Adrian Henri). With the Domain Dark Archive I could find avenues – fan pages,
forums, and the like – which would provide me with information to consider
alongside newspapers, interviews, and archival material. I wanted to see what
labels were attached to the poets, and how they were viewed, in informal
recollections and non-academic contexts. I would then combine and compare this
data with searches for the same terms from newspaper and published works. There
is a marked difference in academic and popular attitudes to the poets, and the
internet archival searches should be able to provide evidence for how the
people who actually received the work viewed their experiences. </div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal">
<u>Methodology: considerations and consequences</u></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
It must be noted that the AADDA
project involved only a slice of the full dataset, and that my results will
almost certainly differ greatly when it goes live. (Just as an example, a
search for “Adrian Henri” on the AADDA browser returns 1,847 results, compared
to over 8,200 current UK hits on Google.) The <i>lack</i> of references is almost certainly due to the smaller dataset,
rather than the data not being there at all (1).</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Another issue was that very
search term, “Adrian Henri”. Searching for just ‘Adrian’ or ‘Henri’ rather than
‘Adrian Henri’ is unhelpful in that it throws up results of which the majority
are not relevant: ‘“Henri” NEAR “painter”’ might give you Matisse; ‘“Adrian”
NEAR “poet”’ might give you Mitchell. My own research and interview experience
has been that people are likely to refer to him as ‘Henri’ or as ‘Adrian’, so
the fact that I was only searching for ‘Adrian Henri’ might have excluded some
results. However, articles in online magazines and the like do usually follow
academic and journalistic traditions of referring to the subject by their full
name in the first instance, and then by surname, and are therefore caught by the
crawl.</div>
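<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
The ambiguity described above can be illustrated with a minimal sketch of a proximity test. This is not the AADDA browser's actual implementation, which is not documented here; the function name <i>near</i>, the <i>window</i> parameter (a maximum distance counted in words) and the two sample sentences are all assumptions made purely for illustration.</div>

```python
import re


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def near(text, first, second, window=10):
    """Return True if the two phrases start within `window` words of each other.

    Hypothetical helper for illustration only; a real search engine may
    measure distance differently.
    """
    tokens = tokenize(text)

    def positions(phrase):
        # Start index of every exact occurrence of the phrase in the text.
        words = tokenize(phrase)
        n = len(words)
        return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]

    return any(
        window >= abs(i - j)
        for i in positions(first)
        for j in positions(second)
    )


# Invented sample sentences, not taken from the archive.
matisse = "Henri Matisse was a French painter known for his use of colour."
henri = "Adrian Henri was a poet and painter from Liverpool."

surname_only_hit = near(matisse, "Henri", "painter")        # matches the wrong Henri
full_name_miss = near(matisse, "Adrian Henri", "painter")   # full name filters it out
full_name_hit = near(henri, "Adrian Henri", "painter")      # six words apart
tight_window_miss = near(henri, "Adrian Henri", "painter", window=3)
```

<div class="MsoNormal" style="text-align: justify;">
In this sketch the surname alone matches the sentence about Matisse, while the full name does not; and the same sentence about Henri matches or fails depending only on the size of the window chosen.</div>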
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
I had to decide what labels to
search for in relation to Henri, and my initial searches – using what terms I
was already aware of – may have excluded other labels and ways of talking about
Henri. I also found that my own academic assumptions were not the standard –
there were 203 results for the label ‘Liverpool poet’, versus only 3 for
‘Merseybeat poet’, the term I am using in my thesis! </div>
<div class="MsoNormal">
<br /></div>
<div align="center">
<table border="1" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="border-collapse: collapse; border: none; mso-border-alt: solid windowtext .5pt; mso-border-insideh: .5pt solid windowtext; mso-border-insidev: .5pt solid windowtext; mso-padding-alt: 0cm 5.4pt 0cm 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr style="height: 13.25pt; mso-yfti-firstrow: yes; mso-yfti-irow: 0;">
<td style="border: solid windowtext 1.0pt; height: 13.25pt; mso-border-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">Search for ‘“Adrian Henri”
AND …’<o:p></o:p></span></div>
</td>
<td style="border-left: none; border: solid windowtext 1.0pt; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">Number of items returned<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 12.5pt; mso-yfti-irow: 1;">
<td style="border-top: none; border: solid windowtext 1.0pt; height: 12.5pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">“painter and poet”<o:p></o:p></span></div>
</td>
<td style="border-bottom: solid windowtext 1.0pt; border-left: none; border-right: solid windowtext 1.0pt; border-top: none; height: 12.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">5<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 13.25pt; mso-yfti-irow: 2;">
<td style="border-top: none; border: solid windowtext 1.0pt; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">“poet and painter”<o:p></o:p></span></div>
</td>
<td style="border-bottom: solid windowtext 1.0pt; border-left: none; border-right: solid windowtext 1.0pt; border-top: none; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">2<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 12.5pt; mso-yfti-irow: 3;">
<td style="border-top: none; border: solid windowtext 1.0pt; height: 12.5pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">“painter/poet”<o:p></o:p></span></div>
</td>
<td style="border-bottom: solid windowtext 1.0pt; border-left: none; border-right: solid windowtext 1.0pt; border-top: none; height: 12.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">5<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 13.25pt; mso-yfti-irow: 4;">
<td style="border-top: none; border: solid windowtext 1.0pt; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">“poet/painter”<o:p></o:p></span></div>
</td>
<td style="border-bottom: solid windowtext 1.0pt; border-left: none; border-right: solid windowtext 1.0pt; border-top: none; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">10<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 13.25pt; mso-yfti-irow: 5;">
<td style="border-top: none; border: solid windowtext 1.0pt; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">“performance poet”<o:p></o:p></span></div>
</td>
<td style="border-bottom: solid windowtext 1.0pt; border-left: none; border-right: solid windowtext 1.0pt; border-top: none; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">0<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 12.5pt; mso-yfti-irow: 6;">
<td style="border-top: none; border: solid windowtext 1.0pt; height: 12.5pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">“performer”<o:p></o:p></span></div>
</td>
<td style="border-bottom: solid windowtext 1.0pt; border-left: none; border-right: solid windowtext 1.0pt; border-top: none; height: 12.5pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">10<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 13.25pt; mso-yfti-irow: 7; mso-yfti-lastrow: yes;">
<td style="border-top: none; border: solid windowtext 1.0pt; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 167.5pt;" valign="top" width="223"><div class="MsoNormal">
<span style="font-size: 10.0pt;">“entertainer”<o:p></o:p></span></div>
</td>
<td style="border-bottom: solid windowtext 1.0pt; border-left: none; border-right: solid windowtext 1.0pt; border-top: none; height: 13.25pt; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt; padding: 0cm 5.4pt 0cm 5.4pt; width: 127.6pt;" valign="top" width="170"><div class="MsoNormal">
<span style="font-size: 10.0pt;">16<o:p></o:p></span></div>
</td>
</tr>
</tbody></table>
</div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 1 – examples of search terms and
results<o:p></o:p></span></i></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
The five results for both
“painter and poet” and “painter/poet” were all from the Tate Archives.(2) This – with search terms placing the artistic side of his output first – is not
surprising, given that the Tate is an art gallery. It did surprise me that
“performance poet” did not prove a useful search term, although this is perhaps
an academic designation rather than a layman’s term – as evidenced by the
results for “entertainer”. But none of these results can be taken at face
value, as this report shall discuss.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<u>Boolean searching: How near is NEAR? <o:p></o:p></u></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
These initial exploratory
searches bring me to my first problem with the data. Throughout this report
what I refer to as problems are not faults with the dataset or the browser but
rather potential issues for the users interacting with it. The parameters for how close
together the two search terms must be can differ, but I found that the NEAR search was
sometimes not near enough here. I found two issues when reading the actual
results: firstly, that the terms were often not that close together; and
secondly, that the second term was not actually being used to discuss Henri:</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwGi6HWLRtGS4nRLgT3Crc0_Yj7zU6nBIQQdXFMn4PNR_7wFapcPDOf8XSu2bRqGEmWmup6Pm-Mi45TsAf6yaoHlcD20qkAGnrmE-0EVcME2TWrDBKZ37RetdIoOb4xZ4hYuo04wKBK41A/s1600/Helen+Taylor+figure+2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="128" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwGi6HWLRtGS4nRLgT3Crc0_Yj7zU6nBIQQdXFMn4PNR_7wFapcPDOf8XSu2bRqGEmWmup6Pm-Mi45TsAf6yaoHlcD20qkAGnrmE-0EVcME2TWrDBKZ37RetdIoOb4xZ4hYuo04wKBK41A/s320/Helen+Taylor+figure+2.png" width="320" /></a></div>
<br />
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 2 - search result for “Adrian Henri”
NEAR “painter”, post on </span></i><span style="font-size: 10.0pt;"><i><a href="http://www.ancestryaid.co.uk/">www.ancestryaid.co.uk</a> (3)</i></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Therefore, the results in the
table listed above are not a reliable source for enumerating the most common
labels attached to Henri – one cannot rely on reading only the initial search
results.</div>
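<div class="MsoNormal" style="text-align: justify;">
One way to check how near NEAR really is would be to measure the distance between the two terms in the text of each result, rather than relying on the hit count. The following is a minimal Python sketch under stated assumptions: the page text has already been extracted to a plain string, and the function names and the ten-word default window are illustrative choices, not part of the AADDA interface.</div>

```python
import re


def _words(text):
    """Lower-case a string and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def min_word_distance(text, phrase_a, phrase_b):
    """Smallest gap, in words, between the starts of two phrases.

    Returns None if either phrase does not occur in the text.
    """
    tokens = _words(text)

    def starts(phrase):
        target = _words(phrase)
        n = len(target)
        return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target]

    pos_a, pos_b = starts(phrase_a), starts(phrase_b)
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def is_near(text, phrase_a, phrase_b, window=10):
    """True if the phrases occur within `window` words of each other.

    The window size is an assumed, adjustable threshold.
    """
    distance = min_word_distance(text, phrase_a, phrase_b)
    return distance is not None and distance <= window
```

<div class="MsoNormal" style="text-align: justify;">
A check of this kind only addresses the first issue, the distance between the terms. Whether the second term is actually being used to describe Henri still has to be judged by reading the result itself.</div>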
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<u>Crawl dates: Encountering a display problem<o:p></o:p></u></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
I have already stated that some
results could not be ‘clicked through’ and their content displayed past that
initial search results page, such as the Tate results for “painter and poet”
and “painter/poet”. There is therefore no way of knowing what the pages
actually contained. At other times, there were results which could not be viewed
for a different reason: they did not even appear on the search results page.</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
This revealed itself to me when
running an exploratory query. After a basic search for “Adrian Henri”, one of
the things that I noticed is that there is a ‘jump’ in the number of hits in
the year 2000. Whilst this is not the highest number (2007 has 345), I thought
that this could be explained by this being the year that he died – obituaries,
tributes, more ‘noise’ around his name.</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmshDaTqFis_J0pUmn39JWHX6H_g_-Ug8ImNQkMFbcLLFWVWaNSp2VqVdaTJkogDNzbqNC63VISuyZ0EF_o_s2b_wq4vF45_RE2FDOHRlY1vRCn9LFBjkwjHCmrgeVfyxMVuYxfoZm75Qb/s1600/Helen+Taylor+figure+3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="114" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmshDaTqFis_J0pUmn39JWHX6H_g_-Ug8ImNQkMFbcLLFWVWaNSp2VqVdaTJkogDNzbqNC63VISuyZ0EF_o_s2b_wq4vF45_RE2FDOHRlY1vRCn9LFBjkwjHCmrgeVfyxMVuYxfoZm75Qb/s320/Helen+Taylor+figure+3.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 3 – showing results for “Adrian
Henri” by crawl year<b> (4)</b></span></i></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Clicking through to filter these results by that year – and
hoping to find relevant obituary results – I encountered my first problem. From the
242 results shown for 2000 in the initial search, the filtered page reported that the “Search found 202 items”:</div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHtqJwpazaQuWbwHqNEWamYEl4zxuOfhEZ2sY3V2B5GiT4wYqP_neQLbbd7ZFwmtDIK_PUwGMjcY3au6zOlGosn1VKEsRc9i0HKHXTaAfdfTQ-Yp3vHxFqi4JTxIJg8woA8pBmKxx3NrJP/s1600/Helen+Taylor+figure+4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="118" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHtqJwpazaQuWbwHqNEWamYEl4zxuOfhEZ2sY3V2B5GiT4wYqP_neQLbbd7ZFwmtDIK_PUwGMjcY3au6zOlGosn1VKEsRc9i0HKHXTaAfdfTQ-Yp3vHxFqi4JTxIJg8woA8pBmKxx3NrJP/s320/Helen+Taylor+figure+4.png" width="320" /></a></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 4 – filtering “Adrian Henri” results
by crawl year “2000” (5)</span></i><o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Furthermore, when clicking
through to the second page of these already shrinking items, the number dropped
again to 186: </div>
<div align="center" class="MsoNormal" style="text-align: center;">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQvoB2VAc5QZmIKMXJYEMsoSX3SJrJ5htRMGCN_np2BSbZRXXgyFOILdJ_Lp7r23ZM-Dn4QLNPgdvM4r7pMor9qmEm9BOi21qD55OnmGKe0f00aDMZn7jKdCPzhEmkdKSKqvShuprZt5db/s1600/Helen+Taylor+figure+5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="117" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQvoB2VAc5QZmIKMXJYEMsoSX3SJrJ5htRMGCN_np2BSbZRXXgyFOILdJ_Lp7r23ZM-Dn4QLNPgdvM4r7pMor9qmEm9BOi21qD55OnmGKe0f00aDMZn7jKdCPzhEmkdKSKqvShuprZt5db/s320/Helen+Taylor+figure+5.png" width="320" /></a></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;"><br /></span></i></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;"><br /></span></i></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 5 – filtering “Adrian Henri” results
by crawl year “2000”, page 2 (6)</span></i></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
This was repeated elsewhere – for
example, the following year, 2001, went from 53 potential results to 37 search
items being displayed. It was not the case that the items were only those which
could be ‘clicked through’ – as the Tate example above shows, those which the
Wayback Machine could not display were still included in the search items.</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
One potential explanation for the
discrepancy between the total number of results and number of items which the
“search found” is that the results returned here might omit duplications,
perhaps where a second crawl finds nothing different from the first. I am
unsure whether this is a valid explanation, as I have found many instances of
crawls where the Wayback Machine’s results are exactly the same from crawl to
crawl. Furthermore, of the 242 results for 2000, 235 were from Amazon.co.uk,
and not related to Henri’s death. I would, therefore, propose that the ‘jump’ came
simply from there being more crawls in that year, as it must be remembered that
the dates are dates at which the sites were recorded, not the dates at which
the material was published.(7) Whatever the reason, this shows that the results must be interrogated further
along the line from the initial search, as however innocent the numbers appear,
they cannot be presented without ‘digging down’ to the actual website results
themselves.</div>
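<div class="MsoNormal" style="text-align: justify;">
The kind of ‘digging down’ described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes that the results could be exported as records with <i>crawl_year</i>, <i>url</i> and <i>text</i> fields (hypothetical names; the interface is not known to offer such an export), and it treats two results as duplicates when their text is identical after whitespace is normalised.</div>

```python
from collections import Counter, defaultdict
from hashlib import sha1
from urllib.parse import urlsplit


def summarise(records):
    """Summarise hypothetical result records by crawl year.

    Each record is a dict with 'crawl_year', 'url' and 'text' keys
    (assumed field names). Returns three mappings keyed by year:
    total results, number of distinct texts, and a Counter of hosts.
    """
    totals = Counter()
    digests = defaultdict(set)
    hosts = defaultdict(Counter)
    for record in records:
        year = record["crawl_year"]
        totals[year] += 1
        # Collapse runs of whitespace so trivial layout changes do not
        # make otherwise identical texts look different.
        normalised = " ".join(record["text"].split())
        digests[year].add(sha1(normalised.encode("utf-8")).hexdigest())
        hosts[year][urlsplit(record["url"]).hostname] += 1
    distinct = {year: len(seen) for year, seen in digests.items()}
    return totals, distinct, hosts
```

<div class="MsoNormal" style="text-align: justify;">
Comparing the total with the distinct count for a year would indicate how far repeated crawls inflate the figure, and the host tally would show at a glance when one site, such as Amazon.co.uk in the 2000 results, accounts for most of the hits.</div>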
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<u>Sentiment Analysis: Don’t take it at face value<o:p></o:p></u></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Taking a quick look at the totals
when doing a basic search for “Adrian Henri” reveals mostly neutral results, as
one might expect from an analysis over a large amount of text, but the results
are also far more positive than negative, if a sentiment is found – 136 “very
positive” versus 11 “very negative”. However, this is another lesson in
‘digging down’ and not taking the results at face value.</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitGaRtq4qyHMsQztwQ3vqUoUCyAjxv-4xp9aDZRonJigBEh96hE1M7GzqT2PitxLmKCYDu2_EqXzFnLCDflJ5kBNkc3nQRCV6ilRQg4sL8f3dvBOCa6cQWawks8MBMYlpSEcj7owIhwDve/s1600/Helen+Taylor+figure+6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="129" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitGaRtq4qyHMsQztwQ3vqUoUCyAjxv-4xp9aDZRonJigBEh96hE1M7GzqT2PitxLmKCYDu2_EqXzFnLCDflJ5kBNkc3nQRCV6ilRQg4sL8f3dvBOCa6cQWawks8MBMYlpSEcj7owIhwDve/s320/Helen+Taylor+figure+6.png" width="320" /></a></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 6 – showing sentiment totals for the
“Adrian Henri” search<b> (8)</b></span></i></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
The success of sentiment analysis
relies in part on how positivity or negativity is determined across the whole
search parameters. This quote from a 1998 school newsletter is clearly – and
does indeed appear under the term – very positive:</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="margin-left: 36.0pt; text-align: justify;">
<span style="background: white;">Many thanks to Stockport Art Gallery staff for the
invitation to bring our Junior children to meet Adrian Henri, the famous artist
and poet, on Wednesday 21 October. Adrian was terrific, telling us the stories
behind many of the pictures currently on exhibition at the Gallery and reading
from his poetry collections. We can really recommend a visit to see his work.
Many thanks to Adrian for a great day with you in Stockport!<span style="font-size: x-small;"> (9)</span></span></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
However, other results which were
listed as “very positive” must be discounted from this total for the same
reason as the proximity searches above: the positive nature of the whole is not
related to Henri’s part. See, for example, the discussion of Carol Ann Duffy’s <i>The World’s Wife </i>in an AQA English
Literature Examiner’s Report from June 2005: </div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="margin-left: 36.0pt; text-align: justify;">
Once again, <i>The World’s Wife</i> proved highly popular:
more centres study this text than any other on the paper. As last year,
examiners were impressed by the enthusiasm and engagement with which many candidates
approach Duffy’s poetry … Examiners were also concerned that intrusive, and
often irrelevant, biographical material (such as lengthy character
assassinations of Adrian Henri) prevented candidates from meeting the
Assessment Objectives.(10)</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Whilst
this, therefore, means one cannot blithely cite all 136 “very positive” results
in Henri’s favour, we also need to revise the total of “very negative” results.
Firstly, of the 11 results, the 6 items which can be displayed are all the same
Peter Finch interview:<o:p></o:p></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEif5o7G7zS7UFHd0xbAyC0fNHQIW7-gGmdDHexY-pDgw83LbEp3c9Xv-xY0OGF_tv6aZwmLNBmAyxUCfaeq9xu1LV-z__r_ZhkvN4L5dicmLeRUEtJSjP7odV6Xv167sfRMYehjcWOuL-CP/s1600/Helen+Taylor+figure+7.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEif5o7G7zS7UFHd0xbAyC0fNHQIW7-gGmdDHexY-pDgw83LbEp3c9Xv-xY0OGF_tv6aZwmLNBmAyxUCfaeq9xu1LV-z__r_ZhkvN4L5dicmLeRUEtJSjP7odV6Xv167sfRMYehjcWOuL-CP/s320/Helen+Taylor+figure+7.png" width="320" /></a></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 7 – results page “Adrian Henri” with
sentiment “very negative”<b> (11)</b></span></i></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
And
secondly, in this interview Henri actually appears very favourably:<o:p></o:p></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="margin-left: 36.0pt; text-align: justify;">
The Liverpool Scene arrived, and with it the merging
of music and poetry with Roger McGough, Brian Patten, Adrian Henri, and others.
I eventually met Adrian Henri, who was also a painter, and the most
interesting, I thought, of the three. We became frends and he pointed me in
some new directions.(12)<o:p></o:p></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
The
Wayback Machine has 12 captures of this page on this site, from October 2006 to
July 2013. Each crawl obviously takes a snapshot of whatever is on the page at
the time, and the crawl date is clearly indicated in the results, but the 11
apparently different “very negative” results are, in practice, all the exact same
interview, the text of which has not changed (bar the removal of the first line
under the title), although the formatting of the page itself has slightly
changed (see the links beneath the header), as illustrated here:<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbAxT3Rd8unt4dxoonXUoSAlgSIFSahwBRrNF1u9jva0Is5C9V1Y1OJ3pwo1z5kCFaDeJRxaQPIm2x3RLhSEx-CpfAVYseTdyP5wtfE251kBww3L01sFZFQFiO0Ed_IWlasW5R-Rmy9mEq/s1600/Helen+Taylor+figure+8a.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="108" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbAxT3Rd8unt4dxoonXUoSAlgSIFSahwBRrNF1u9jva0Is5C9V1Y1OJ3pwo1z5kCFaDeJRxaQPIm2x3RLhSEx-CpfAVYseTdyP5wtfE251kBww3L01sFZFQFiO0Ed_IWlasW5R-Rmy9mEq/s320/Helen+Taylor+figure+8a.png" width="320" /></a></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 8a – first Wayback Machine capture
of </span></i><span style="font-size: 10.0pt;"><i><a href="http://www.argotistonline.co.uk/">www.argotistonline.co.uk</a> (13)</i></span></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<br /></div>
<div align="center" class="MsoNormal" style="text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZBs-MAix6gNVLXtmgRzK_DhCrtv8E5jwSWS9yuG9h8_6rl8lO4L10tCdQO2zxd74N9zWAWk__nKmE1eP5eYXMibh1HuWYdabwl6c5ZIGYEi5KzTWYEQoWggQkCmqjSnUFbiNX5tGgo-aW/s1600/Helen+Taylor+figure+8b.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="108" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZBs-MAix6gNVLXtmgRzK_DhCrtv8E5jwSWS9yuG9h8_6rl8lO4L10tCdQO2zxd74N9zWAWk__nKmE1eP5eYXMibh1HuWYdabwl6c5ZIGYEi5KzTWYEQoWggQkCmqjSnUFbiNX5tGgo-aW/s320/Helen+Taylor+figure+8b.png" width="320" /></a></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;"><br /></span></i></div>
<div align="center" class="MsoNormal" style="text-align: center;">
<i><span style="font-size: 10.0pt;">Fig 8b – last Wayback Machine capture of
</span></i><span style="font-size: 10.0pt;"><i><a href="http://www.argotistonline.co.uk/">www.argotistonline.co.uk</a> (14)</i></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
I have suggested that one reason
for the discrepancy between the total number of results and the items which can
be displayed is that the duplications might not be shown, and the snapshots for
this page do show that there have been changes over time, but what this also
shows is the need to interrogate the results, at the level of those snapshots,
rather than making assumptions based on the initial totals. Whilst this may be
deliberately simplifying the issue, the message to take away here is not to
take the results at face value: there aren’t 11 “very negative” results – there
are none at all!</div>
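<div class="MsoNormal" style="text-align: justify;">
Because a Wayback Machine address carries both a 14-digit capture timestamp and the original address, repeated captures of one page can be grouped mechanically. The following Python sketch shows the idea; the function names are illustrative, and treating the host and path as the identity of a page (ignoring the port) is an assumption made for this example. Grouping by address does not prove that the text is unchanged between captures, so the snapshots themselves still need to be read.</div>

```python
import re
from collections import defaultdict
from datetime import datetime
from urllib.parse import unquote, urlsplit

# Wayback Machine addresses of the form
# http://web.archive.org/web/<YYYYMMDDhhmmss>/<original address>
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_capture(wayback_url):
    """Split a Wayback address into (capture datetime, original address)."""
    match = WAYBACK.match(wayback_url)
    if match is None:
        raise ValueError("not a Wayback Machine capture address: " + wayback_url)
    stamp, original = match.groups()
    return datetime.strptime(stamp, "%Y%m%d%H%M%S"), original


def page_key(original_url):
    """Identify a page by host and decoded path, ignoring the port."""
    parts = urlsplit(original_url)
    return parts.hostname, unquote(parts.path)


def group_captures(wayback_urls):
    """Map each page to the sorted list of dates on which it was captured."""
    groups = defaultdict(list)
    for url in wayback_urls:
        captured, original = parse_capture(url)
        groups[page_key(original)].append(captured)
    return {page: sorted(dates) for page, dates in groups.items()}
```

<div class="MsoNormal" style="text-align: justify;">
Applied to the three addresses cited in notes 12 to 14, this would return a single page with three capture dates, which is the pattern behind the apparently separate “very negative” results.</div>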
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<u>Brief Conclusions<o:p></o:p></u></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
This report has attempted to
present some of the potential mishaps involved with looking at the Web Archive results
on the surface, at face value. What my exploratory searches have shown is that one
cannot make assumptions based purely on looking at the initial search results –
you have to dig down. </div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Being involved in the AADDA
project was certainly useful for my own research, as I found sources of
information which I wouldn’t have found otherwise, such as pages which are no
longer live, or places I hadn’t thought to look. It was also fascinating to
read non-academic histories of performance poetry and the 1960s underground,
where Henri and the Merseybeat poets appear as far more important than in
‘official’ criticism.(15) These histories
were also presented as if public knowledge, proving my theory that those
‘ordinary’ people who received the work did have an idea of its importance, and
that the audiences for this kind of poetry were significant, particularly in
terms of recognising the legacy of the Merseybeat poets where academia has
dismissed them. However, what my research experiences have been far more useful
for, I believe, is pointing up some of the potential issues – both with the
interface (display problems) and the users (making assumptions) – before the
Domain Dark Archive goes live. </div>
<br />
<div>
<!--[if !supportFootnotes]--><br clear="all" />
<hr align="left" size="1" width="33%" />
<!--[endif]-->
<br />
<div id="ftn1">
<div class="MsoFootnoteText">
(1) I am
aware of sites which were not included in the slice available for this initial
project, as well as those without a UK domain suffix which are beyond the scope
of the project, such as <a href="http://www.my-liverpool.co.uk/">www.my-liverpool.co.uk</a>
or <a href="http://www.mudcat.org/">www.mudcat.org</a>. </div>
</div>
<div id="ftn2">
<div class="MsoNormal">
(2) <span style="font-size: 10.0pt;">The Tate Archive results could not be shown by the
Wayback Machine, due to ‘robots.txt’ on the site – see <a href="http://web.archive.org/web/20060824234002/http:/archive.tate.org.uk:80/DServe/dserve.exe?dsqServer=tg_calm&dsqApp=Archive&dsqDb=Catalog&dsqCmd=Browse.tcl&dsqSearch=*(RefNo='TAp*')&dsqKey=RefNo">http://web.archive.org/web/20060824234002/http://archive.tate.org.uk:80/DServe/dserve.exe?dsqServer=tg_calm&dsqApp=Archive&dsqDb=Catalog&dsqCmd=Browse.tcl&dsqSearch=*(RefNo='TAp*')&dsqKey=RefNo</a><o:p></o:p></span></div>
</div>
<div id="ftn3">
<div class="MsoNormal">
(3)<span style="font-size: 10.0pt;"> <a href="http://web.archive.org/web/20070514010256/http:/www.ancestryaid.co.uk:80/boards/archive/index.php/t-928.html">http://web.archive.org/web/20070514010256/http://www.ancestryaid.co.uk:80/boards/archive/index.php/t-928.html</a><o:p></o:p></span></div>
</div>
<div id="ftn4">
<div class="MsoFootnoteText">
(4) <a href="http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian+henri%22&sort_by=solr_document&sort_order=ASC">http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian+henri%22&sort_by=solr_document&sort_order=ASC</a></div>
</div>
<div id="ftn5">
<div class="MsoFootnoteText">
(5) <a href="http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian%20henri%22&sort_by=solr_document&sort_order=ASC&f%5b0%5d=crawl_year%3A%222000%22">http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian%20henri%22&sort_by=solr_document&sort_order=ASC&f[0]=crawl_year%3A%222000%22</a></div>
</div>
<div id="ftn6">
<div class="MsoFootnoteText">
(6) <a href="http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian%20henri%22&sort_by=solr_document&sort_order=ASC&page=1&f%5b0%5d=crawl_year%3A%222000%22">http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian%20henri%22&sort_by=solr_document&sort_order=ASC&page=1&f[0]=crawl_year%3A%222000%22</a></div>
</div>
<div id="ftn7">
<div class="MsoFootnoteText">
(7) This is
something which we have discussed at AADDA meetings, and I feel that the
interface does make this clear; it is just something which should be stressed
to users in any guidance material, to avoid misunderstanding.</div>
</div>
<div id="ftn8">
<div class="MsoFootnoteText">
(8) <a href="http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian+henri%22&sort_by=solr_document&sort_order=ASC">http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian+henri%22&sort_by=solr_document&sort_order=ASC</a></div>
</div>
<div id="ftn9">
<div class="MsoFootnoteText">
(9) <a href="http://web.archive.org/web/19991008172118/http:/webserv1.stockportmbc.gov.uk:80/pages/links/schools/primary/ourlarc/oct1998.htm">http://web.archive.org/web/19991008172118/http://webserv1.stockportmbc.gov.uk:80/pages/links/schools/primary/ourlarc/oct1998.htm</a></div>
</div>
<div id="ftn10">
<div class="MsoFootnoteText">
(10) <a href="http://web.archive.org/web/20060618094849/http:/www.aqa.org.uk:80/qual/pdf/AQA-5741-6741-WRE-Jun05.pdf">http://web.archive.org/web/20060618094849/http://www.aqa.org.uk:80/qual/pdf/AQA-5741-6741-WRE-Jun05.pdf</a></div>
</div>
<div id="ftn11">
<div class="MsoFootnoteText">
(11) <a href="http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian%20henri%22&sort_by=solr_document&sort_order=ASC&f%5b0%5d=sentiment%3A%22Very%20Negative%22">http://www.webarchive.org.uk/aadda-discovery/browse?text=%22adrian%20henri%22&sort_by=solr_document&sort_order=ASC&f[0]=sentiment%3A%22Very%20Negative%22</a></div>
</div>
<div id="ftn12">
<div class="MsoFootnoteText">
(12) <a href="http://web.archive.org/web/20070208145352/http:/www.argotistonline.co.uk:80/Finch%20interview.htm">http://web.archive.org/web/20070208145352/http://www.argotistonline.co.uk:80/Finch%20interview.htm</a></div>
</div>
<div id="ftn13">
<div class="MsoFootnoteText">
(13)<span style="font-size: x-small;"> </span><a href="http://web.archive.org/web/20061019024105/http:/www.argotistonline.co.uk/Finch%20interview.htm">http://web.archive.org/web/20061019024105/http://www.argotistonline.co.uk/Finch%20interview.htm</a></div>
</div>
<div id="ftn14">
<div class="MsoFootnoteText">
(14)<span style="font-size: x-small;"> </span><a href="http://web.archive.org/web/20130723093311/http:/www.argotistonline.co.uk/Finch%20interview.htm">http://web.archive.org/web/20130723093311/http://www.argotistonline.co.uk/Finch%20interview.htm</a></div>
</div>
<div id="ftn15">
<div class="MsoFootnoteText">
(15) See,
for example, <a href="http://web.archive.org/web/19961221024212/http:/www.users.dircon.co.uk:80/~dirkje/pjmanif.htm">http://web.archive.org/web/19961221024212/http://www.users.dircon.co.uk:80/~dirkje/pjmanif.htm</a>
or <a href="http://web.archive.org/web/20020701043924/http:/www.artcircus.org.uk:80/route/version5/paper/paper_article_detail.asp?idno=3">http://web.archive.org/web/20020701043924/http://www.artcircus.org.uk:80/route/version5/paper/paper_article_detail.asp?idno=3</a></div>
</div>
</div>
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-57488497198750030262013-06-13T08:22:00.002+01:002013-06-13T08:25:28.499+01:00A page, but not as we know it<br />
<br />
<i><span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">James
Baker, Digital Curator, British Library </span></i><br />
<br />
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">It is
commonplace to describe something new in relation to something that is known:
think 'motion picture', 'spaceship', 'email' or 'smartphone'. The word
'webpage' is no different. And indeed in a sense many webpages are similar to
the pages found in books or newspapers: they hold static media (text, image);
core elements of them read from top to bottom; their headers, footers,
cut-aways and advertisements orientate, guide and entice the reader; and in
URLs they possess a (relatively) unique system of identifiers. It is hard to
think of another name these digital objects could have been given.</span><br />
<br />
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">It is also
commonplace for the new thing to - linguistically speaking - replace the old
thing: think 'motion picture' and 'the pictures', 'spaceship' and 'ship',
'email' and '<a href="https://www.youtube.com/watch?v=jCetfaS7GAo" target="_blank">mail</a>', or 'smartphone' and 'phone'.
The same goes for 'webpage' and 'page'. Here, by virtue of this act of
redefinition, the 'page' absorbs features of the webpage not (or less) possible
in book or newspaper pages: features such as dynamic content, user interaction,
and direct links to other pages (or, more precisely, other pages that are not
part of a sequence defined by the author whose work is the main content held by
the page).</span><br />
<br />
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">All of this
makes the webpage-cum-page appear both familiar and unsettling, conservative
and disruptive, old and new. These elements of lineage are crucial, for they
have allowed us (among other things) to think of preserving the webpage as akin
to preserving the page. Yes, the challenges of novelty and disruption are
discussed and debated (on which I'm not qualified to comment), but at the most
basic level the webpage stuff that is being collected by the <a href="http://archive.org/index.php" target="_blank">Internet
Archive</a> or the <a href="http://www.webarchive.org.uk/ukwa/" target="_blank">UK Web Archive</a> is page-level stuff.
(This is not to say I don't think page-level stuff should be archived. Far from
it: the fragility of webpages is well known (see <a href="http://ahr.oxfordjournals.org/content/108/3.toc" target="_blank">Rosenzweig, 2003</a>), and without these
efforts valuable data on our society would be lost.)</span><br />
<br />
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">But what are
these pages and how can historians use them? A seminar jointly hosted by the <a href="http://www.history.ac.uk/events/seminars/321" target="_blank">Digital History seminar</a> and the <a href="http://www.history.ac.uk/events/seminars/211" target="_blank">Archives and Society seminar</a> at the <a href="http://www.history.ac.uk/" target="_blank">Institute
of Historical Research</a> sought last night to tackle this very problem,
asking quite simply 'Is this a new class of primary source for historians?'.
After a presentation on the <a href="http://www.webarchive.org.uk/ukwa/" target="_blank">UK Web Archive</a> and the <a href="http://domaindarkarchive.blogspot.co.uk/" target="_blank">Analytical Access to the Domain Dark
Archive</a> project, both the speakers and the audience were largely in
agreement that yes, the web archive is a new class of primary source, of
historical stuff.</span><br />
<br />
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">Does this
make our nomenclature for this stuff problematic? For to call a
webpage a page is potentially to place it into a category for which it is
ill-suited, and to put the techniques for investigating that category under
huge strain. Take <a href="http://www.guardian.co.uk/world/2013/jun/12/julia-gillard-abbott-sexist-menu" target="_blank">a normal news article from the Guardian
website</a> as an example. The page contains a story, framing, context and
advertisements: all very page-like. But those adverts are dynamic as opposed to
static, their content quite possibly targeted depending on the IP address
accessing the URL and different each time the page is refreshed. The page also
contains moderated comments, ranked by default oldest first but malleable to
user preferences. In short, when you visit the website it is unlikely to be the
same as when I visit the website, so an archived version can only be one possible
version of a webpage at a particular historical moment. Not very page-like
behaviour. Of course we might (quite rightly for the most part) say that the
'core' of the page, the textual content that historians are likely to be
interested in, will remain the same regardless of these peripheral changes. And
yet as the growth of mainstream live blogs demonstrates (such as those
covering <a href="http://www.guardian.co.uk/world/middle-east-live/2013/jun/12/turkey-erdo-an-clears-taksim-square-live-reaction" target="_blank">the Taksim Square protests</a>), the web
is moving toward dynamic content over static content as default: embedded
video, maps and text content streams are now commonplace, and are likely to
become more so as the web develops.</span><br />
<br />
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">The webpage
then is a rapidly evolving beast whose capacity to change whilst still being
called a 'page' complicates how we do research using webpages and how we
preserve the internet. It is a page but not a page as we knew it, a semantic
shift worth keeping in mind as we prepare for an era of born-digital historical
scholarship.</span><br />
<br />
<i><span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">This post
was first published on the <a href="http://britishlibrary.typepad.co.uk/digital-scholarship/2013/06/a-page-but-not-as-we-know-it.html" target="_blank">British Library's Digital Scholarship blog</a></span></i><span style="font-family: "Arial","sans-serif"; font-size: 10.0pt;">. </span><br />
<br />
<!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="267">
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]--><!--[if gte mso 10]>
<![endif]--><br />Jane Wintershttp://www.blogger.com/profile/03532683090414094258noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-70062201995371116652013-02-15T10:12:00.001+00:002013-02-15T10:12:18.799+00:00Public Health in Local Government, 2001-2012<br />
This is another in our series where researchers planning to use the archive outline their projects. This guest post is by Martin Gorsky of the <a href="http://www.lshtm.ac.uk/">London School of Hygiene and Tropical Medicine</a>.<br />
<br />
<h3>
Public Health in Local Government, 2001-2012: web representations and practices</h3>
<div>
<br /></div>
<b>Background </b>The Health and Social Care Act of 2012 has introduced a major restructuring of the National Health Service (NHS). As part of this process public health duties have been transferred from NHS bodies to local government. The Department of Health (DH) describes the initiative as reviving 'a long and proud history' by 'returning public health home'. That history includes not only the great achievements of Victorian sanitary reform, but also the early NHS, because in Bevan's original design public health departments were based in local authorities, where they remained until the NHS reorganisation of 1974. Policy-makers see exciting opportunities for public health to further its goals within the local government setting. They hope that by being more closely connected to local people's needs and living environments, health professionals will be able to develop better programmes and to tackle inequalities. This, they believe, should work better than 'one size fits all' policies.<br />
<br />
<b>Historiography</b> What might the recent history of public health in local government tell us about the opportunities and challenges ahead? There is already a limited body of work examining the changes of 1974 from a national perspective, which argues that various problems had arisen. Post-war public health 'lost its way', lacking a clear philosophy and rationale at a time of rapidly changing health needs. It was also sidelined by other professional groups, delivering patchy services and failing to link effectively with the NHS. There is also a more recent history which is as yet unexamined. Since 2001, and gaining official force in 2006, the DH has championed the joint appointment of Directors of Public Health (DPHs) to straddle both NHS primary care trusts and local authorities. The activities of these new appointees may give some interesting clues as to whether the structural and philosophical challenges of the earlier period retain their force or can be overcome.<br />
<br />
<b>Aim </b> The project will therefore aim to identify the web presence of these joint appointment DPHs during the period from 2001 up until the passage of the recent Act. By reading these texts it will ask:<br />
a. whether a coherent rationale for public health in a local government setting is discernible<br />
b. what practices of joint working between NHS and local government are reported, and how the benefits of integration are represented.<br />
<br />
<i>Martin Gorsky</i><br />
<i>Centre for History in Public Health, LSHTM</i><br />
<div>
<br /></div>
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-3515114291101525762012-12-19T15:43:00.001+00:002012-12-19T15:43:52.475+00:00Exploring and Uncovering British Euroscepticism in the Dark Archive<br />
Here is another in our series of guest posts by those researchers who plan to use the archive for topics of particular interest to them:<br />
<br />
Richard Deswarte - 'Exploring and Uncovering British Euroscepticism in the Dark Archive'<br />
<br />
Britain's relationship to, and subsequent engagement in, the process of European integration is one of the most important political, economic and social developments of the last 50 years. This relationship was controversial even before the UK joined the EEC, as it then was, in 1973, and it has remained controversial ever since. The views and arguments of those individuals and groups who have opposed British membership, commonly referred to over the last twenty years as 'Euroscepticism', have been one of the enduring elements of British political and media debate. In the past 15 years - exactly the period of the Web Domain dataset - much of this debate has taken place on the Web, with many pro- and anti-European groups setting up webpages and engaging in debate via blogs and other postings. To date there has been no dedicated research based on these online sites and debates. In conjunction with more traditional archival research that I am undertaking on British Euroscepticism, my AADDA project will take the opportunity to uncover and analyse the phenomenon of Euroscepticism on the Web.<br />
<br />
This research will use the following tools and digital research methods. In the first instance I will engage in some Google-style Ngram searching based on such key terms as Euroscepticism, EU, UKIP, Euroreferendum, etc. This should produce some interesting aggregate and qualitative results, and patterns relating to the volume, timing and variety of websites and references. Following this I will undertake some proximity searching of related terms to see if this brings up different results and patterns. I am also keen to see what image searching, as one can do in the current UK Web Archive, produces, given what I suspect will be a large number of images on these webpages. If time allows, it will be interesting to see whether sentiment analysis can be applied to gauge the degree of negativity of webpages and websites, and how successfully it can do so. Finally I will filter the results by such elements as domain type and medium type to see what interesting patterns, if any, emerge. At the same time I will be open to trying out some of the other tools and methods that the other researchers are finding particularly successful in their case studies.<br />
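<br />
The Ngram-style searching described above could be sketched roughly as follows. This is only a minimal illustration in Python: the <code>pages</code> records, their <code>year</code> and <code>text</code> fields and the sample sentences are all hypothetical, since the archive's actual data format and search interface are not described here.<br />

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records; the real archive's data format is an assumption.
pages = [
    {"year": 2004, "text": "UKIP campaigns on Euroscepticism ahead of the vote."},
    {"year": 2004, "text": "The EU constitution and UKIP."},
    {"year": 2009, "text": "Euroscepticism and the EU after Lisbon."},
]

TERMS = ["euroscepticism", "eu", "ukip"]


def term_counts_by_year(records, terms):
    """Return {year: Counter(term -> occurrences)}, matching whole words case-insensitively."""
    wanted = set(terms)
    counts = defaultdict(Counter)
    for record in records:
        # Split into alphabetic tokens so that "eu" never matches inside "euroscepticism".
        tokens = re.findall(r"[a-z]+", record["text"].lower())
        for token in tokens:
            if token in wanted:
                counts[record["year"]][token] += 1
    return counts


counts = term_counts_by_year(pages, TERMS)
for year in sorted(counts):
    print(year, dict(counts[year]))
```

Counting whole tokens rather than substrings is a deliberate choice here: it keeps short terms such as EU from being inflated by longer words that happen to contain them.<br />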
<div>
<br /></div>
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-36218730264892695962012-11-27T17:40:00.000+00:002012-11-27T17:40:06.633+00:00Sentiment Analysis and the Reception of the Liverpool PoetsThis is the latest in our series of guest posts from the AADDA researchers who are proposing ways in which the archive can inform their research. This post is from Helen Taylor of Royal Holloway:<br />
<br />
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
I am currently writing a doctoral
thesis entitled ‘Adrian Henri and Merseybeat poetry: performance, poetry, and
public in the Liverpool Scene of the 1960s’ (at Royal Holloway, with Professor
Robert Hampson). My work draws extensively on archival research and oral memory, particularly
in relation to the live event and oral poetry in Liverpool at the time. </div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Sentiment analysis of the Domain
Dark Archive would be useful in relation to my work on the Liverpool Poets and
their reception not only by the mainstream media but also by those who
experienced their work at the time (in the form of memoir, via fan pages,
forums, and the like), and as such could provide me with another area of
information to consider alongside newspapers, interviews, and archival
material.</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
My main proposal for the AADDA is
for a small, self-contained project involving proximity search. I have found
in my research that a variety of labels have been attached to the poets, and I
think it would be most interesting to see how Adrian Henri, Roger McGough, and
Brian Patten are referred to in forums and similar (informal) internet sites.
Henri is often referred to in academic material as a poet/painter, but I want
to find out how ordinary people, for want of a better word, labelled him – and
I will then combine and compare this data with searches for the same terms from
newspaper and published works, as there is a marked difference between academic and
popular attitudes to the poets. </div>
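<div class="MsoNormal" style="text-align: justify;">
As a rough illustration of the proximity search proposed above, the following Python sketch counts the words that occur within a few tokens of a poet's surname. The function name, the window size and the sample sentence are all invented for the example; the tools actually offered with the archive may work quite differently.</div>

```python
import re
from collections import Counter


def words_near(text, target, window=5):
    """Count words appearing within `window` tokens either side of each occurrence of `target`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    target = target.lower()
    nearby = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            start = max(0, i - window)
            # Tokens before and after the match, excluding the match itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            nearby.update(neighbours)
    return nearby


# Invented sample sentence, for illustration only.
sample = "I saw Adrian Henri read once; Henri the painter was also a fine poet."
labels = words_near(sample, "henri", window=3)
print(labels["painter"], labels["poet"])
```

<div class="MsoNormal" style="text-align: justify;">
With a window of three tokens, 'painter' falls within range of the surname but 'poet' does not, which shows how sensitive the resulting labels are to the chosen window size.</div>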
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Subsequent to this, I would like
to run geo-indexing analysis, to see where (as well as who and when) these
results are coming from. I would expect results within Liverpool, but it would
be interesting to see which other places are recorded. It would be particularly
interesting to see if the Liverpool 8 postcode (which is where the poets were
living and working) would be an area of memorialisation. </div>
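<div class="MsoNormal" style="text-align: justify;">
One way such geo-indexing might work, assuming it relies on postcodes found in page text, is sketched below in Python. The regular expression is a simplified approximation of the UK postcode format rather than a full validator, and the page snippets and postcodes are invented for illustration.</div>

```python
import re
from collections import Counter

# Simplified pattern for UK postcodes: outward code (e.g. "L8"), optional space,
# inward code (e.g. "1XX"). It is an approximation, not a full validator.
POSTCODE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b")


def outward_code_counts(texts):
    """Count postcode districts (outward codes) mentioned across a list of page texts."""
    districts = Counter()
    for text in texts:
        # Upper-case first so that postcodes typed in lower case are also found.
        for outward, _inward in POSTCODE.findall(text.upper()):
            districts[outward] += 1
    return districts


# Invented page snippets, for illustration only.
sample_pages = [
    "Readings held at a gallery in Liverpool L8 1XX every month.",
    "Fan club address: London SW7 2DG. Memories of Liverpool l8 7zz.",
]
districts = outward_code_counts(sample_pages)
print(districts.most_common())
```

<div class="MsoNormal" style="text-align: justify;">
Grouping by the outward code alone is what would allow a district such as Liverpool 8 to be compared with other areas.</div>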
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
This project could be important
for my research because I am approaching the literary movement from a multi-,
inter-, and cross-media perspective, to present Merseybeat poetry as ‘total
art’. In the archives in Liverpool there are flyers for events with a variety
of labels for the poets (many of which were written by the poets themselves for
events and tours), but I want to be able to provide evidence for how the people
experiencing the work have categorised the poets and I think that proximity
search will help me prove my thesis.</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Helen Taylor</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-88019569169154310122012-11-12T11:39:00.002+00:002012-11-12T11:39:49.088+00:00London French Geo-Indexing and Image Tagging<br />
This is the third in our series of guest posts from researchers with proposals for how the domain dark archive can be interrogated. Saskia Huc-Hepher of the University of Westminster writes:<br />
<br />
<br />
Calculating the precise number of French people living in the capital and specifying where they live within the sprawling city has to this day never been achieved. The French Embassy itself admits to its ignorance in this respect, stating that there are approximately 120 000 individuals registered at the French Consulate in London, but that they estimate the true number of French Londoners to be somewhere between 300 000 and 400 000. I have devised several strategies to try to determine an accurate figure with more certainty, from scrutinising the number of native French speakers in London's state schools (by borough) to examining the numbers of French citizens registering for UK National Insurance cards (by year), and my next tactic is to consult the electoral rolls of each London constituency, pending the publication of the 2011 census data (which now includes questions on identity and language). Whilst I am aware of the limitations of a geo-indexing study, that is, that it will not provide a 'hard' figure for the specified period, my hope is that targeted searches might serve to triangulate my current findings. My aim is therefore to use the geo-indexing tool to map out the areas of London with the greatest concentrations of French inhabitants on the basis of the postcodes associated with 'French' web sites / spaces. This data would have the potential either to confirm the unexpected findings of the Francophone-schoolchildren investigation mentioned above (unexpected in that the borough with the highest number of French speakers was Lambeth, not Kensington and Chelsea as the stereotype might suggest) or, on the contrary, to reinforce the stereotype, as depicted in a map reproduced by the Think London (A. Wlores) report which identified Kensington & Chelsea, Westminster, Hammersmith & Fulham and Wandsworth as having the largest concentrations of French residents. 
It would also have historical value in that it would ascertain whether or not there was any relationship between the areas most associated with the London French today and the areas favoured in previous waves of migration to the capital. The findings could then be used in the multi-layered e-resources referred to in the context of the aforementioned AHRC bid.<br />
<br />
A study of this kind, focused on the French community in London, would be unprecedented and therefore make an entirely novel and original contribution to both academic and political spheres.<br />
<br />
In addition to the 'physical' demographic mapping process I describe above, my doctoral research will also involve a multi-modal analysis of the French community websites selected for the Special Collection. Given the inherent and increasing multi-modality of the Internet, an ethnosemiotic approach to the examination of the London French web content would seem to be the most appropriate. My intention is to depict the visual landscape constructed by the French community websites and, using semiotic theory, attempt to infer meaning from the images and draw ethnographic conclusions regarding the community's sense of belonging; how they perceive and conceive London and its inhabitants; how they (re)present and define their own identity through images; what elements of France and Frenchness they portray and promote, etc. In order to give this visual study greater temporal contextualisation and depth, I intend to conduct a parallel micro-study on the Domain Dark Archive visual data using some kind of image-tagging analytical tool, which would allow a word, or combination of words, such as 'French' and 'London', to be used to search only for photographs or images that have been uploaded onto the (London French) websites contained in the archive. This study could also serve to triangulate the findings of the geo-indexing investigation, in that the images and spaces associated with key words such as 'London', or specific areas within London, may overlap with the places and spaces that were identified as being particularly French through the geo-indexing process and/or historically. This investigation would therefore have two objectives: providing visual data for ethnosemiotic analysis and triangulating the geo-indexing data.<br />
<br />
Further investigative mechanisms, comparable to the image-tagging search and analysis tool described above, could also be envisaged with the focus being on, by way of example, video or soundtracks. They were deemed, however, within the framework of the Domain Dark Archive, to be of reduced pertinence given that the earlier websites would undoubtedly contain less meaningful and more restricted data as a result of the technical constraints of the era. It is worthwhile considering such studies, nevertheless, for future scholarly research or AADDA pilots.<br />
<br />
Saskia Huc-Hepher<br />
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com1tag:blogger.com,1999:blog-2475624895636745490.post-44256313913369980862012-10-30T11:39:00.000+00:002012-10-30T11:39:15.740+00:00PISA Rankings and public discourseThis is a guest post by another of the AADDA researchers, Gemma Moss:<br />
<br />
<br />
<i>PISA Rankings and public discourse: Using the web domain dataset to explore how comparative statistical data have been used to set an agenda for educational change in the UK</i><br />
<br />
The Programme for International Student Assessment (PISA) is a way of comparing educational performance in different countries, by testing students at age 15 when they are preparing to leave schooling for work. Conducted at three-yearly intervals by the Organisation for Economic Co-operation and Development (OECD) since 2000, the latest round in 2012 involved 64 countries including all 34 OECD members. Since their inception the rank orderings of countries’ performance have acted as a major spur to educational reform in many jurisdictions, particularly countries which collect little performance data of their own. The findings are treated in national media as international league tables, with coverage in the UK focusing on our relative position (near to the mean) and whether we have risen or fallen in the rankings. This information often enters political discourse.<br />
<br />
This project will use the potential of the web domain dataset to explore how reports of the first four cycles of assessment in the PISA series (2000; 2003; 2006; 2009) were covered on the net. In particular the research aims are to identify:<br />
•<span class="Apple-tab-span" style="white-space: pre;"> </span>the kinds of institutions that gave most prominence to the PISA findings,<br />
•<span class="Apple-tab-span" style="white-space: pre;"> </span>how the findings were interpreted, and<br />
•<span class="Apple-tab-span" style="white-space: pre;"> </span>the extent to which they led to calls for system reform.<br />
<br />
In addition, this project will explore whether the analytic tools offered for analysing the web domain dataset enhance or hinder this form of enquiry.<br />
<br />
The research questions are:<br />
1. Can the analytic tools suggested for use with the web domain archive help establish:<br />
•<span class="Apple-tab-span" style="white-space: pre;"> </span>Which kinds of institutions were most likely to comment on PISA data? (Newspapers; government agencies; universities; think-tanks; individuals in the blogosphere)<br />
•<span class="Apple-tab-span" style="white-space: pre;"> </span>How the data were represented and interpreted?<br />
•<span class="Apple-tab-span" style="white-space: pre;"> </span>What the data led to in terms of ideas for system change in the UK?<br />
<br />
2. Do the analytic tools employed to answer 1. offer efficiencies of research time and scale in understanding the uptake and recontextualisation of research knowledge about PISA via the web and the knowledge communities it represents?<br />
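<br />
As a first approximation to the question of which kinds of institutions commented on PISA, pages could be grouped by the type of domain they were published on. The Python sketch below is illustrative only: the suffix-to-category mapping is an assumption, the URLs are invented, and a suffix such as .co.uk cannot by itself distinguish a newspaper from any other commercial site, so real categories would need manual checking.<br />

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative mapping only; it is an assumption, not a classification used by the project.
SUFFIX_CATEGORIES = [
    (".gov.uk", "government"),
    (".ac.uk", "university"),
    (".org.uk", "organisation"),
    (".co.uk", "commercial"),
]


def categorise(url):
    """Return a coarse institution type based on the host name's suffix."""
    host = urlparse(url).hostname or ""
    for suffix, category in SUFFIX_CATEGORIES:
        if host.endswith(suffix):
            return category
    return "other"


# Invented URLs, for illustration only.
urls = [
    "http://www.example.gov.uk/pisa-2006-results",
    "http://research.example.ac.uk/blog/pisa",
    "http://www.example.co.uk/education/league-tables",
    "http://news.example.co.uk/pisa",
    "http://example.com/my-thoughts-on-pisa",
]
tallies = Counter(categorise(u) for u in urls)
print(tallies)
```

A tally of this kind would only be a starting point: categories such as think-tanks and individual bloggers cut across domain suffixes and would have to be identified by other means.<br />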
<br />
Gemma Moss, Institute of Education<br />
<div>
<br /></div>
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-23534284435048350952012-10-18T12:22:00.003+01:002012-10-18T12:22:55.833+01:00The Decline of Parliamentary Political Engagement, 2004-2010: implications for 2012 and beyondThis is a guest post by Carole Taylor, one of the researchers investigating the Domain Dark Archive as part of the AADDA project:<br />
<br />
<br />
I am investigating the decline of Parliamentary political engagement in the UK since 2004, a trend documented in the Hansard Society’s annual Audit[s] of Political Engagement. Public attitudes to the political process have “hardened” in recent years; for example the number of people certain that they will vote in a national election has dropped to an all-time low of 48%. My particular interest is in the impact of the work of MPs and peers in the Westminster Parliament on public opinion; I want to be clearer about the links between political engagement and what Parliament does.<br />
<br />
In my research proposal to this consultation, I suggested four questions that the Domain Dark Archive might address:<br />
<br />
<b>One</b>: could we identify websites addressing some or all of the core indicators of political engagement (ie, knowledge and interest, action and participation, and efficacy and satisfaction)?<br />
<b>Two</b>: could comparison searches be done to give parliamentarians an insight into changing public perceptions of the parliamentary process?<br />
<b>Three</b>: can social media forums used by parliamentarians be identified in a time-sensitive way that highlights political themes commented on from one year to the next?<br />
And <b>four</b>: could we examine the House of Lords blog, say, to analyse how politicians – peers in this case – engaged with the spontaneous, seldom thought-through but increasingly influential eruptions of public opinion expressed in tweets and blogs?<br />
<br />
Given the limited amount of time we will have with the dataset this spring, I plan to focus on the last two questions, using the House of Lords as a case study not least because the Lords was the first parliamentary chamber in the world to set up a bipartisan blog (in 2008). Many peers comment on other blogs as well, and it will be interesting to chart how a discrete group of peers and public have interacted online during a period of decline in so-called political engagement. Between now and the spring I will interview peers with an interest in social media in order to identify why they got involved in blogging in the first place. This research will give me relevant key words and phrases to submit to the DDA consultation for search and analysis.<br />
<br />
<br />
Dr Carole Taylor BSc, MA, PhD<br />
taylorcm@parliament.uk<br />
<div>
<br /></div>
Jonathan Blaneyhttp://www.blogger.com/profile/15856886701364691512noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-57869244396497567422012-05-25T08:55:00.001+01:002012-06-11T09:50:35.235+01:00Workshop 3: for humanities and social scientists<b>Bookings for this event are now closed.</b><br />
<br />
Could you imagine what research questions you might be able to answer using a comprehensive archive of UK websites for the period 1996 to 2010? If so, this workshop may be for you, and bookings are now open. It offers an introduction to the Domain Dark Archive, a unique new research dataset, purchased by the JISC from the Internet Archive, in the keeping of the British Library, and not yet publicly available.<br />
<br />
The workshop affords a unique opportunity to learn about the DDA, and to help shape the development of the new user interface for the data. The results of these workshops will directly influence the development of the search, analysis and display tools for the new service.<br />
<br />
Where: British Library (St Pancras, London)<br />
When: Wednesday 13 June, 10.50am - 3.15pm<br />
<br />
Sessions include:<br />
<ul>
<li>Introducing the UK Web Archive at the BL, and the Domain Dark Archive in particular;</li>
<li>What is analytical access anyway, and what could it do for me?</li>
<li>Case study: How one scholar might use the Domain Dark Archive [see <a href="http://domaindarkarchive.blogspot.co.uk/2012/04/what-on-earth-would-i-do-with-this-data.html" target="_blank">earlier post</a> for a preview]</li>
</ul>
To book a place, contact the project manager, Dr Peter Webster, at Peter.Webster@sas.ac.uk, with a brief statement of your research interests. Booking is free, but places are very limited.<br />
<br />
The event will be most suitable for scholars at doctoral level or higher, but should not be viewed as introductory research training.<br />
A sandwich lunch will be served, and we will also be able to reimburse reasonable travel expenses within the UK. <br />
<br />Peter Websterhttp://www.blogger.com/profile/11658752319509408253noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-20989636473414627462012-05-25T08:52:00.001+01:002012-06-11T09:50:45.438+01:00Workshop 2: arts and humanities<b>Bookings for this event are now closed. </b>
<br />
<b><br /></b><br />
Could you imagine what research questions you might be able to answer using a comprehensive archive of UK websites for the period 1996 to 2010? If so, this workshop may be for you, and bookings are now open. It offers an introduction to the Domain Dark Archive, a unique new research dataset, purchased by the JISC from the Internet Archive, in the keeping of the British Library, and not yet publicly available.<br />
<br />
The workshop affords a unique opportunity to learn about the DDA, and to help shape the development of the new user interface for the data. The results of these workshops will directly influence the development of the search, analysis and display tools for the new service.<br />
<br />
Where: British Library (St Pancras, London)<br />
When: Tuesday 12 June, 10.50am - 3.00pm<br />
<br />
Sessions include:<br />
<ul>
<li>Introducing the UK Web Archive at the BL, and the Domain Dark Archive in particular;</li>
<li>What is analytical access anyway, and what could it do for me?</li>
<li>Case study: How one scholar might use the Domain Dark Archive [see <a href="http://domaindarkarchive.blogspot.co.uk/2012/04/what-on-earth-would-i-do-with-this-data.html" target="_blank">earlier post</a> for a preview]</li>
</ul>
To book a place, contact the project manager, Dr Peter Webster, at Peter.Webster@sas.ac.uk, with a brief statement of your research interests.<br />
Booking is free, but places are very limited.<br />
<br />
The event will be most suitable for scholars at doctoral level or higher, but should not be viewed as introductory research training.<br />
A sandwich lunch will be served, and we will also be able to reimburse reasonable travel expenses within the UK. <br />
<br />Peter Websterhttp://www.blogger.com/profile/11658752319509408253noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-62169215817225364122012-04-17T08:15:00.001+01:002012-06-11T09:51:29.424+01:00Workshop 1: for historians<b>Bookings for this event are now closed.</b><br />
<b><br /></b><br />
Could you imagine what historical questions you might be able to answer using a comprehensive archive of UK websites for the period 1996 to 2010 ? If so, this workshop may be for you, and bookings are now open.<br />
<br />
The workshop affords a rare opportunity to learn about, and shape the development of, a unique new dataset, purchased by the JISC from the Internet Archive, and in the keeping of the British Library. <br />
<br />
Where: British Library (St Pancras, London)<br />
When: Thursday 24 May, 11am - 3.15pm<br />
<br />
Sessions include:<br />
<ul>
<li>Introducing the UK Web Archive </li>
<li>How one historian might use the Domain Dark Archive [see <a href="http://domaindarkarchive.blogspot.co.uk/2012/04/what-on-earth-would-i-do-with-this-data.html" target="_blank">earlier post</a> for a preview]</li>
<li>What is analytical access anyway, and what could it do for me ? </li>
</ul>
To book a place, contact the project manager, Dr Peter Webster, at Peter.Webster@sas.ac.uk [<b>or, from May 7, jane.winters@sas.ac.uk</b>], with a brief statement of your research interests.<br />
Booking is free, but places are very limited. Bookings will close at <b>12 noon</b> on <b>May 10th</b>, and applicants will hear whether they have secured a place soon afterwards. The event will be most suitable for scholars at doctoral level or higher.<br />
Lunch will be served, and we will also be able to reimburse reasonable travel expenses within the UK. <br />
<br />Peter Websterhttp://www.blogger.com/profile/11658752319509408253noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-38833372408354771662012-04-14T09:14:00.000+01:002012-04-14T09:14:03.241+01:00What on earth would I do with this data ?We’re busy arranging a series of workshops in May and June. Their purpose is to gather humanities and social science scholars together to think collectively about the kind of purposes to which they might put a near-comprehensive dataset of the UK web domain 1996-2010.<br />
<br />
The exercise is going to involve the use of the imagination to an extent, since part of this project is to help the British Library to design a user interface for the new dataset; so there isn’t yet anything ‘to play with’, as it were. In order to help fuel scholars’ imaginations, I’ve started to sketch how I myself, as an historian of contemporary British Christianity, might start to use the dataset; what questions I would like to ask of it.<br />
<br />
I come to this with a research interest in the forms of words in which religion (broadly defined) is discussed, and how those modes of discourse change over time. This can usefully be thought of using the following scheme:<br />
(i) there are some perennial issues, in relation to what we might call constitutional Christianity, taking such questions as the position of the bishops in the House of Lords, and the establishment of the Church of England<br />
(ii) there are older issues that have been ‘reactivated’ in recent years. For instance, denominational church schools were an issue as far back as the 1906 general election. After a period of calm about the issue in public discussion, the last decade or so has seen the issue come back to prominence - except, of course, that they are now known as faith schools.<br />
(iii) there are also new issues, the obvious one being the perception of a threat from radical Islamism; an issue that was simply absent until relatively recently.<br />
<br />
I personally am particularly interested in the domain dark archive, since the period 1996-2010 frames many of these issues perfectly. So, what might I ask of the archive, and which tools might I use ?<br />
<br />
<i>Basic visualisation: the Ngram</i><br />
<br />
At the most basic level, I might want to look at the incidence of particular terms, and look for periods in which a particular term is employed more often. For this, there is the Ngram; a visualisation tool that is already employed by Google, and on the existing <a href="http://www.webarchive.org.uk/ukwa/ngram/" target="_blank">UK Web Archive</a>. Consider the following case: in February 2008, the archbishop of Canterbury Rowan Williams gave a lecture to an audience of lawyers which reflected on the scope for the incorporation of sharia law into UK law. For some details of the media storm that followed, see <a href="http://peterwebster.wordpress.com/2008/03/01/rowan-williams-and-sharia/" target="_blank">here</a>. An Ngram of the incidence of the word 'sharia' in the existing selective web archive looks like this:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ri4cIs7-CeK4r7ELFvCq93wZO5VCXtlnL9CYZvr-H-8feV4rcKbMNN6p9Gm5V-wtsfJug3giqUQde8HJMiUdbK46z3Jd_rK6p8aZbA_W_XhZJ4PKGSu77TpNrv5iI9385jwc8saIQs-0/s1600/sharia.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ri4cIs7-CeK4r7ELFvCq93wZO5VCXtlnL9CYZvr-H-8feV4rcKbMNN6p9Gm5V-wtsfJug3giqUQde8HJMiUdbK46z3Jd_rK6p8aZbA_W_XhZJ4PKGSu77TpNrv5iI9385jwc8saIQs-0/s320/sharia.png" width="320" /></a></div>
As we might expect, there is a big spike in the incidence of the term at the time of the lecture, and then heightened activity for much of the following year. I had expected the former, certainly, but not the latter to the same extent; and so I now know to look more at the content indicated by those subsequent spikes in activity.<br />
If one then plots both of the terms 'sharia' and 'archbishop', the graph looks like this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHLxce8PjoNZHxUNkLl7KwD8sjv23kkLfyZs_Ima9aelvhHD6E31JukMEMJuhP-bCq6eTXRcW97L7FHEYK6kWpYiS6hTMU5LjRNvdYLb2q02sl41iRFsXikzDG7z0ZRqgpMtN9V3b7Xs_C/s1600/archbishop+sharia.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHLxce8PjoNZHxUNkLl7KwD8sjv23kkLfyZs_Ima9aelvhHD6E31JukMEMJuhP-bCq6eTXRcW97L7FHEYK6kWpYiS6hTMU5LjRNvdYLb2q02sl41iRFsXikzDG7z0ZRqgpMtN9V3b7Xs_C/s320/archbishop+sharia.png" width="320" /></a></div>
The spike in the terms happens at roughly the same point; but the incidence of 'archbishop' is higher, due perhaps to the wider speculation about Dr Williams' position as a result of the controversy. Also, the repeated peaks visible for 'sharia' aren't present for 'archbishop', suggesting that the debate about the former outlasted the particular instance of the lecture.<br />
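<br />
The counting that sits behind a graph of this kind can be sketched in a few lines. What follows is only an illustration, assuming that each archived page is available as a (month, text) pair; the function name and the sample pages are invented here, and the UK Web Archive's own Ngram tool may well count and normalise differently.<br />

```python
import re
from collections import Counter


def term_frequency_by_month(documents, term):
    """Return {month: share of all words in that month that equal `term`}.

    `documents` is an iterable of (month, text) pairs, e.g. ("2008-02", "...").
    """
    hits = Counter()    # occurrences of the term per month
    totals = Counter()  # total number of words per month
    wanted = term.lower()
    for month, text in documents:
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[month] += len(tokens)
        hits[month] += tokens.count(wanted)
    # Normalise by volume so that months with more archived text
    # do not automatically appear as spikes.
    return {m: hits[m] / totals[m] for m in sorted(totals) if totals[m]}


# Invented sample pages, for illustration only.
sample = [
    ("2008-01", "The archbishop visited a school"),
    ("2008-02", "Sharia lecture by the archbishop prompts debate on sharia"),
    ("2008-03", "Debate on sharia continues"),
]
series = term_frequency_by_month(sample, "sharia")
for month, share in series.items():
    print(month, round(share, 3))
```

Dividing by the total number of words in each month is a design choice rather than a necessity: raw counts would also show a spike, but would exaggerate any month in which more pages happened to be archived.<br />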
<br />
<i>Proximity searching and sentiment analysis</i><br />
<br />
One might, of course, want to go further than this, and by means that aren't yet possible within the UK Web Archive. One such means would be a <a href="http://en.wikipedia.org/wiki/Proximity_search_%28text%29" target="_blank">proximity search</a> - looking for terms occurring within a certain number of characters' distance of each other in the same source. The graph above shows only how often each term occurs; crucially, it does not show whether the two occur together <i>in the same source</i>. A proximity search would make the connection that is suggested by the graphs above much more secure.<br />
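<br />
To make the idea concrete, here is a minimal sketch of a proximity test on a single page, measuring the distance in characters between the starting points of the two terms. The function name, the threshold and the sample sentences are assumptions made for illustration; a production search engine would do this against an index rather than by scanning text.<br />

```python
import re


def within_proximity(text, term_a, term_b, max_chars):
    """True if some occurrence of term_a starts within max_chars
    characters of some occurrence of term_b in the same text."""
    lowered = text.lower()

    def starts(term):
        # Whole-word, case-insensitive matches; returns start offsets.
        pattern = rf"\b{re.escape(term.lower())}\b"
        return [m.start() for m in re.finditer(pattern, lowered)]

    starts_a = starts(term_a)
    starts_b = starts(term_b)
    return any(abs(a - b) <= max_chars for a in starts_a for b in starts_b)


# Invented sample pages, for illustration only.
page_together = "The archbishop spoke about sharia law."
page_apart = "The archbishop opened a new school."

print(within_proximity(page_together, "archbishop", "sharia", 30))
print(within_proximity(page_apart, "archbishop", "sharia", 30))
```

Counting only the pages for which such a test succeeds, month by month, would give a graph of the two terms genuinely occurring together, rather than two independent lines.<br />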
<br />
Even more interesting would be <a href="http://en.wikipedia.org/wiki/Sentiment_analysis" target="_blank">sentiment analysis</a>: gauging the attitude of the writer of a webpage towards the term employed, using various techniques including natural language processing to find terms denoting approval or disapproval occurring in connection with the search term. The present archbishop, when he retires at the end of the year, may look back on a very particular relationship with the media during his time in office. I would be interested to see whether 'archbishop' appeared more often in the data with negative connotations after the sharia controversy.<br />
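<br />
A very crude version of this can be sketched by counting approving and disapproving words that occur near the search term. This is a deliberately simple word-list approach, not the natural language processing that a real sentiment tool would use; the two word lists, the window size and the sample sentences below are all invented for illustration.<br />

```python
import re

# Tiny hand-made word lists, assumed here purely for illustration.
POSITIVE = {"welcome", "praise", "support", "thoughtful"}
NEGATIVE = {"criticism", "outrage", "resign", "attack"}


def sentiment_near_term(text, term, window=5):
    """Score = positive minus negative words found within `window`
    words either side of each occurrence of `term`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    wanted = term.lower()
    score = 0
    for i, token in enumerate(tokens):
        if token != wanted:
            continue
        nearby = tokens[max(0, i - window): i + window + 1]
        score += sum(word in POSITIVE for word in nearby)
        score -= sum(word in NEGATIVE for word in nearby)
    return score


# Invented sample sentences, for illustration only.
hostile = "Outrage as archbishop faces calls to resign"
friendly = "Many praise the archbishop for a thoughtful lecture"

print(sentiment_near_term(hostile, "archbishop"))
print(sentiment_near_term(friendly, "archbishop"))
```

Averaging such scores by month, before and after February 2008, would be one rough way to approach the question about 'archbishop' posed above, although irony, quotation and negation would all defeat a method this simple.<br />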
<br />
These, of course, are only some attempts to imagine what might be possible using the Domain Dark Archive. I shall be blogging more as the project progresses, and the possibilities become clearer.<br />
<br />
<br />
<br />Peter Websterhttp://www.blogger.com/profile/11658752319509408253noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-63758261292551419502012-03-14T15:40:00.000+00:002012-03-14T15:40:21.163+00:00AADDA's 'elevator pitch'I'm just on my way home from a very productive meeting with all the projects in this JISC sustainability strand. One of the activities during the day was building a one-minute "elevator pitch" for the project, using the <a href="http://www.alumni.hbs.edu/careers/pitch/" target="_blank">Pitch Builder</a> from Harvard Business School. And so - here it is:<br />
<br />
"AADDA is a joint venture between the IHR, the British Library and the University of Cambridge. It aims to transform the way in which humanities and social science researchers interact with the single most important archive of .uk web materials. It will develop innovative tools for analytical access to 40TB of primary data from UK webspace (1996-2010) and as a result will allow scholars to ask hitherto impossible questions of a singularly significant dataset. The archival record of contemporary Britain has increasingly migrated to a digital-only environment. The sheer volume of the record now requires new tools to render it accessible to scholars, and to unlock this unique and largely unexplored resource. In the next year, we will draw on a committed group of researchers to guide the British Library in the specification and development of tools for the analysis of the domain archive. Use cases arising from the project will be integral to the Library's sustainability strategy for the archive."Peter Websterhttp://www.blogger.com/profile/11658752319509408253noreply@blogger.com0tag:blogger.com,1999:blog-2475624895636745490.post-89168132389560280722012-02-02T10:39:00.000+00:002012-02-02T10:39:28.040+00:00The AADDA project<br />
<div style="background-color: white; font-family: 'Trebuchet MS', 'Myriad Pro', Myriad, Arial, Helvetica, sans-serif; font-size: 13px; line-height: 19px; margin-bottom: 1em; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
AADDA is an 18-month project to enhance the sustainability of a substantial dark archive of UK domain websites collected between 1996 and 2010 by the Internet Archive, copies of which were recently acquired by the JISC and are stored at the British Library on their behalf. The project team will work with researchers in contemporary history in particular, and digital humanities in general, to obtain feedback on the feasibility of using web archives at an analytical level. The project will build on this feedback and on the existing UK Web Archive interface in order to develop new forms of analytical access to this collection, thereby enabling researchers to carry out unique and hitherto impossible research queries. This will make a significant contribution to the global understanding of the research value of web archives, particularly for collections that span a decade or more. Demonstrating the utility of web archives for scholarly research will significantly enhance the long-term sustainability of the collection and provide valuable data about use cases to justify ongoing funding.</div>
<div style="background-color: white; font-family: 'Trebuchet MS', 'Myriad Pro', Myriad, Arial, Helvetica, sans-serif; font-size: 13px; line-height: 19px; margin-bottom: 1em; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
The project will assess and aim to increase the acknowledged value of domain web archives for scholarly research. Following a survey of current perceptions and consultation with researchers, it will develop prototype tools for the exploitation of domain web archives, raise awareness of the material and services available, promote discussion and debate among key stakeholders, and inform future scholarly access arrangements at a domain level.</div>
<div style="background-color: white; font-family: 'Trebuchet MS', 'Myriad Pro', Myriad, Arial, Helvetica, sans-serif; font-size: 13px; line-height: 19px; margin-bottom: 1em; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">
The project is led by Dr Jane Winters (IHR), and managed by Dr Peter Webster (IHR), working closely with Maureen Pennock and colleagues at the British Library and Dr Anne Alexander (University of Cambridge). Evaluation will be undertaken by Simon Tanner (King's College London).</div>Peter Websterhttp://www.blogger.com/profile/11658752319509408253noreply@blogger.com0