AADDA Testing Report:
The French Community in London
by Saskia Huc-Hepher
1 - Methodology
The initial purpose of this research was two-fold: firstly,
to use the geo-indexing tool to map out the areas of London with the greatest
concentrations of French inhabitants on the basis of the post-codes associated
with 'French' Web sites / spaces; and, secondly, to identify French community websites in the
Domain Dark Archive (DDA) appropriate for subsequent multimodal analysis on the
basis of their visual and textual meaning potentialities. The ultimate objective
of the former was to triangulate the
findings of additional empirical research conducted within the framework of my
PhD, which sought to ascertain the actual numbers and hot-spots of the London
French community, thereby serving to dispel the exclusively, or at least
predominantly, South Kensington myth. The aim of the latter, meanwhile, was to
scrutinise the visual landscape of the London French over the period of the DDA
data set, as (re)presented through the images – still or moving, in parallel to
the technological advances of the Internet – displayed on the French community
websites found in the DDA. It was envisaged that this historical visual data
would provide the study with greater temporal contextualisation and depth, and,
using social semiotic theory, in particular multimodality, would allow meaning
to be inferred and ethnographic conclusions drawn from the images, on such
subjects as the community's sense of belonging; how they perceive and conceive
London and its inhabitants; how they (re)present and define their own identity
through images; what elements of France and Frenchness they portray and
promote; and whether any of these have changed over time.
Similarly, it was hoped that the geo-indexing analysis would be of historical
value, determining
whether or not there was any relationship between the areas most associated
with the London French today and those districts favoured in previous waves of
migration to the capital.
The final objective of the DDA research proposed here was for the image-tagging
analytical tool to enable a search on a word, or combination of words, such as
'French' and 'London', to return photographs or images only, the visual data
thereby potentially
serving to triangulate the findings of the geo-indexing investigation in that
the images and spaces associated with key words such as 'London', or specific
areas within London, could have coincided with the places and spaces that were
identified as being particularly French through the geo-indexing process and/or
historically. This micro-investigation was therefore to be binary in its
objectives: visual data for ethnosemiotic analysis and geo-indexing data for
triangulation of previous qualitative research.
The methodology outlined above was adopted on several occasions over the course
of the AADDA
project time-span: firstly in March 2013, later in August 2013 and September
2013, with a final trial, using the most functional interface and comprehensive
data set, in October 2013. The results, at every stage, however, were
disappointing.
2 – Deep Search Data Testing
March 2013
The first trial session was carried out in the knowledge
that at that point in time the DDA included only a random subset of the entire
cohort of data, but one which was evenly spread over the archive in temporal
terms. Therefore, in theory, trends, developments and patterns should have been
identifiable, despite sentiment analysis and geographic options not being
available at that stage. In practice, however, a number of basic search hurdles
prevented any valuable findings from materialising. These included:
- the lack of clarity regarding the need to click on the crawl date to access a
website; choosing the website title would have been more intuitive. Such
functionality was updated at the subsequent meeting (21/03/2013);
- the lack of clarity regarding the purpose of the bar charts at the top of the
page; they have since been removed;
- the fact that not all web captures functioned at that time – e.g. Le Petit
Parisien restaurant had no images and almost no text (but enabled me
to do a current Google search for the website, only to find out that the
restaurant – and website – is now closed; this is therefore an example of
the potential historical worth of the DDA, had it been operating
correctly, in allowing the analysis of obsolete Websites);
- some websites cited in the list of 'hits' subsequently being found to be
unavailable; the links to alternative sites proved to be useful, however;
- time being wasted revisiting Websites which had already been scrutinised. Once a
site has been viewed, it would be helpful and more time-efficient if the
visited link appeared in a different colour (e.g. purple, cf. Google) from
the others on the list;
- the fact that search tools operated extremely slowly and the interface was not
yet user-friendly. Speeds and appearance have since improved and the
latter is no doubt a work in progress;
- http://web.archive.org/web/20080601000000*/http://www.guardian.co.uk/world/2008/jul/12/france.islam
Here, every separate date in the July (burka scandal) peak (as well as all
the other dates in August and October 2008, the two snapshots available
from 2009 and the single one from 2012) showed the same snapshot
from The Guardian (12 July 2008). If the online material is
unchanged in relation to another date, this should be immediately visible
on the list of data (possibly via colour coding, as suggested for the
pre-visited Web pages, or grouping by content & date);
- the majority of search results not being particularly useful for my purposes;
they were either not relevant (for instance displaying large numbers of
Websites related to French tourism for English users) or not
French-specific (that is, 'Londres' retrieved results in Portuguese,
Spanish, etc., not French exclusively; while English search words
retrieved sites aimed at Francophiles as opposed to Francophones);
- phrase searching using the “double inverted commas” being equally disappointing
(nothing of relevance was found following a search for “French community
London”, or indeed '“French” and “community”', trialled at a later stage);
“French London” was therefore tested, resulting in a list of sites
relating to French teachers & jobs in London.
Conversely, it was useful to have the
'media' / 'pdf' search options at the bottom of the screen, as this enabled
access to images and audio 'texts' (of relevance to the multimodal methodological
/ theoretical approach taken in my research).
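The suggestion in the list above, that captures whose content is unchanged from one crawl date to the next be grouped by content and date, could be sketched along the following lines. This is a minimal illustration in Python and not a description of how the DDA interface or the Internet Archive actually works: the function name, the capture dates and the page texts are all hypothetical, and a hash of the page text is simply assumed to be an adequate test of 'unchanged'.

```python
import hashlib
from collections import defaultdict


def group_identical_captures(captures):
    """Group (crawl_date, page_text) pairs whose page text is identical.

    Returns a list of (digest, [crawl_dates]) tuples, ordered by the
    earliest crawl date in each group.
    """
    groups = defaultdict(list)
    for crawl_date, page_text in captures:
        # Identical text yields an identical digest, so repeated
        # snapshots of an unchanged page fall into the same group.
        digest = hashlib.sha1(page_text.encode("utf-8")).hexdigest()
        groups[digest].append(crawl_date)
    grouped = [(digest, sorted(dates)) for digest, dates in groups.items()]
    return sorted(grouped, key=lambda group: group[1][0])


if __name__ == "__main__":
    # Hypothetical captures: dates and texts are invented for illustration.
    sample = [
        ("2008-07-12", "article text as first captured"),
        ("2008-08-03", "article text as first captured"),
        ("2009-01-15", "article text as first captured"),
        ("2012-05-01", "article text after a later revision"),
    ]
    for digest, dates in group_identical_captures(sample):
        print(digest[:8], "captured on", ", ".join(dates))
```

A results list built on such groups could show one entry per distinct version of a page, with the repeated crawl dates folded beneath it, rather than one entry per crawl date.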
Overall, the initial testing was found to be useful in
assessing the lasting impact, or otherwise, of the French community on London,
in a temporally comparative manner. That is, by identifying French restaurants/cafés/businesses
through their retrospective on-line presence before submitting the titles to a
live Google search at the time of testing, I was able to discover if such
enterprises were growing, in decline or defunct. Whilst that limited use was of
potential value to my research in assessing the lasting contribution of French
businesses to London's cultural and economic landscape, I was nevertheless
acutely aware (given my curation of the London French Special Collection for
the UK Web Archive) of the mass of relevant data – such as community websites
and blogs – which had not been detected or listed as featuring in the DDA. It
was hoped at the time that this was due to the incomplete and arbitrary state
of the data set.
August 2013
This trial was more successful than the last as regards the
speed and efficiency of the data search tools, despite there still being only a
five per cent random, if temporally representative, sample of websites
available. Somewhat paradoxically, those searches which pinpointed the early
years of Internet use, namely 1996 and 1997, proved to be the most valuable.
Several different searches were tested on this occasion, as follows:
a) A search for the terms “French
community” was filtered by language, using the “French” option. This
functionality was found to be extremely useful in reducing the large amount of
irrelevant data to a more manageable subset. Again, by filtering further, this
time by year (in this case 1996 and 1997), I was able to focus in on yet more pertinent
Web pages. Thus, when I began to analyse the <Associations Françaises>
site, I noted that the landing page directed the visitor to separate sites, one
for French expatriates and one for Belgians. Not only are these sites an
indication of the relative establishment of the said Francophone communities in
the UK, each warranting an on-line home for the long list of associations set
up in the country of residence, but the fact that a distinction is made between
Belgian and Franco-French populations has implications regarding identity.
Using the same search terms, another site <Les
Grenouilles Cablées>, harvested in 1996, proved worthy of an initial
analysis. Firstly, the landing page pointed the visitor in the direction of
three separate sub-sections: <Grenouilles du monde>, <Grenouilles des
USA> and <Grenouilles de Californie>. These distinctions suggest that
either the French expatriate community was more significant in the USA than
elsewhere at that time (including London, which is no longer the case and
perhaps related to the opening of European borders) or that US residents,
including French ones, were earlier adopters of Internet technology than in the
UK. When examining the site more closely and entering the
<Grenouilles du monde> space, it was telling that the
first choice was then <Nouvelles de France> (before the hyperlink to
Quebec), which suggests that this website is indeed aimed at the French expat
diaspora worldwide, linked together by their shared affinity to France, and
keen to maintain links with the homeland. Further, when choosing the French
news link, the selection of newspapers available was a left-leaning one. Again,
the possible implications of this are two-fold: either the political leanings
of the newspapers featured are an indication of the papers' social commitment,
i.e. making information freely available to all, or they are an indication of
the profile of the diaspora visiting on-line sites at that time, i.e. Libération
and Charlie Hebdo both target a young, left-wing readership. If this is
the case, it is thus a profile at odds with the predominantly right-wing
(particularly at that time) expat community of the South Kensington stereotype,
which serves to substantiate the hypothesis posited at the beginning of this
report. There are also hyperlinks to <Météo France> (suggestive of a need
for a physical sense of proximity to the homeland, despite the geographical
distance separating the community from it) and to <Les dernières nouvelles
d'Alsace> and <Pariscope>, both of which could be indicative of a longing
for insignificant local minutiae in the globalised age, made possible through
the worldwide Web, as well as pointing towards greater emigration from eastern
France (and Belgium, as confirmed by the first website) and the French capital
than other geographical zones.
This site offers links to French audiovisual sites
including radio and TV and, perhaps more importantly for my research, to two
on-line fora, <French Talk> and <Francopolis>, which are evidence of
the formation of both Internet and French communities (despite other
empirical evidence suggesting that the French community per se does not exist,
or if at all, in South Kensington alone). Finally, this website creator's
recommended sites are telling in terms of identity (just as a Blog would be
today in its related networks) especially within the theoretical framework of
Pierre Bourdieu's Habitus, with the Vatican, Charlie Hebdo, the RATP (equivalent to TfL in London) and various
French sports sites (football, Formula 1 and rugby) featuring among others.
Another site displayed following this search was the
<Association des Francophones de Cranfield> in which advice is provided
on low-cost means of transport to France and Belgium. This in itself
demonstrates that the target audience are medium- to long-term French residents
of the UK, rather than short-term visitors, and that they have been attracted
to England by its (Higher) education system – a point which, as incongruous as
it may appear, is corroborated by the qualitative data gathered outside the AADDA
project.
b) The second search undertaken in the
August trial was “London French” by “content type”, notably “image”. This was
highly disappointing and of little use given that the few images which were
displayed related to French football or simply contained a set of codes, with
no discernible image.
c) To counter the insufficiency of the
image search above, a “format search” was instead chosen from the AADDA
homepage. This was more successful in terms of number, with some 6,369 items
listed for the “French London + format” search trialled, filtered by year
(2006). However, given that the images were not tagged and stood in complete
isolation, their usefulness was questionable, as many appeared to relate not to
the French community in London, but to websites on French property or to
university Webpages.
d) This search attempted to assess the
value of the post-code filter, which initially was again rather disappointing.
Given the lack of pertinence of the majority of the sites identified after the
early years (1996, 1997), their related post-codes were of equal irrelevance.
Furthermore, there were no apparent clusters of London websites, with many
coming from outside London; no micro-geographical/demographic conclusions could
therefore be drawn. A
subsequent search (“French community” filtered by language and year), despite
listing only one Website, revealed two potentially telling post-codes, N7 and
NW5, for 2010, which could have been related to the forthcoming opening of a
new French State school in Kentish Town (NW5) (but the insignificant numbers
involved are again inconclusive).
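By way of illustration, the kind of clustering the post-code filter was expected to reveal could be approximated by counting how often each postcode district occurs among the postcodes attached to a set of search results. The sketch below is in Python and rests on several assumptions: the postcodes are invented for the example, the function names are hypothetical, the regular expression is a simplified pattern for UK postcodes rather than a full validator, and nothing here reflects how the AADDA geo-indexing tool itself was implemented.

```python
import re
from collections import Counter

# Simplified UK postcode pattern (an assumption, not a full validator):
# the captured group is the outward code, i.e. the district (N7, NW5, SW7).
POSTCODE = re.compile(r"\b([A-Z]{1,2}[0-9][0-9A-Z]?)\s*[0-9][A-Z]{2}\b")

# Postcode areas that make up the London postal district.
LONDON_AREAS = {"E", "EC", "N", "NW", "SE", "SW", "W", "WC"}


def district_counts(postcodes):
    """Count occurrences of each postcode district in a list of strings."""
    counts = Counter()
    for postcode in postcodes:
        match = POSTCODE.search(postcode.upper())
        if match:
            counts[match.group(1)] += 1
    return counts


def london_only(counts):
    """Keep only the districts whose leading letters are a London area."""
    return {
        district: n
        for district, n in counts.items()
        if re.match(r"[A-Z]+", district).group() in LONDON_AREAS
    }


if __name__ == "__main__":
    # Hypothetical postcodes, invented purely for illustration.
    sample = ["NW5 1AB", "nw5 2cd", "N7 6PA", "MK43 0AL", "no postcode here"]
    counts = district_counts(sample)
    print(counts.most_common())
    print(london_only(counts))
```

With only a handful of sites per search, as in the trial described above, such counts would remain too small to support any demographic conclusion; the sketch merely shows the shape of the analysis that a fuller data set might have allowed.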
e) A search for “communauté française”, filtered by year (2001) and
language (French) identified a Blog, which would have been of particular
pertinence to my research. However, it transpired that the said Blog was the
work of an English-speaker, practising their written French, rather than a
French Londoner's Blog. The lack of Blogs retrieved by the DDA search engine
was perplexing, as many are known to me within the framework of my UK Web
Archive Special Collection work. The question of whether this is due to the
domains favoured by the London French Bloggers as hosts for their
autobiographical logs is therefore worth consideration, and if so, the
possibility of accessing them through the DDA should also be contemplated.
f) The same search as in item (e), this time written in and filtered by
the English language for the year 2010, found only one Website, the <Ile aux
enfants> school in North London. Despite the unexpected limitedness of the
search results in this case, the “links to host” tool was telling, particularly
in terms of “mapping the field” and Bourdieu's “three-stage analysis” paradigm.
That is, by scrutinising the – predominantly institutional – list of Websites
linked to the <Ile aux enfants>, such as <ambafrance>,
<assemblee-afe>, <bienvenuealondres> and <edufrance>,
socio-cultural assessments were facilitated. Nevertheless, it was frustrating
that these links to the host site were not functioning during the trial,
directing the visitor back to the host page as opposed to opening the linked
Webpage itself. It was not clear, therefore, whether their inclusion was
exclusively for quantitative analysis (the number of visits was in brackets),
as they were of no qualitative worth without access to the content of the
linked Websites.
September 2013
The most notable and satisfying difference between this trial and the
preceding ones was that all the links to related Websites were at least
partially, and in the great majority of cases completely, successful. This
meant that the discovery of one website (from a long list of still relatively
futile others), namely the “Londoscope” reference pages of the
<www.acticours.freeserve.co.uk> site, proved to be invaluable through its
hyperlinks, as opposed to the content of the site itself. Thus, several
pertinent results were attained, as detailed below:
a) The appearance of London French social-networking-type pages, known as
<Londoscope>, is perhaps indicative of the growing numbers of French
Londoners seeking a physical sense of community by means of digital linking and
dissemination mechanisms. Entries such as “Eglise protestante française de
Londres: Soirée anti-stress” and the enumeration of French films on show at the
Ciné Lumière and the NFT, together with other French cultural events at the
Institute of Contemporary Arts, bear witness to the importance of French culture
to London's overall cultural capital and are also evidence of community
belonging in practice.
b) The <Londoscope> pages from 2003 enabled the identification of a
culturally and historically pertinent French amateur dramatics group which has
been performing in London since 1929: Le Cercle dramatique français (CDF). My
research into this amateur theatre company can now be taken forward in an
effort to ascertain whether it is still in existence and, if so, its place in
French community life today.
c) Another link on the same Website, from 2004, referred to the Francophone
television channel TV5 celebrating its 20th anniversary and revealed
some useful viewer figures, including it being watched in 167 million
households in 2003, with some 56 million weekly viewers. This constitutes
further evidence as to the impact of the French language and culture worldwide
and potentially to the growing French diaspora.
d) The final finding of relevance during this trial session was the
<Londoscope> link to the ADFE (Association Démocratique des Français à
l'Etranger), created in 1980 'par des Français qui voulaient, pour les
représenter, une association dynamique et correspondant aux nouvelles réalités
de l'expatriation' (i.e. by French people who sought representation through a
dynamic association in tune with the new realities of expatriation). This
quotation alone is of worth for a number of reasons; firstly the notion of
'representation' itself is key, as it begs the question of 'representation to
whom?', which, reading further, it appears is to the French authorities.
This in turn indicates that the need to be politically represented in France
has its roots much further back historically than the election in 2012 of the
first ever Député for French overseas residents implies, as well as
demonstrating an unwillingness to
integrate fully in the London socio-political scene and an attachment to the
homeland. Similarly, the notion of “new realities” suggests a shift from an old
form of migration to a new one, acting as a temporal forerunner to the massive
wave of cross-Channel immigration which began in the early nineties and
continues to this day. The term “dynamique” could also be seen to illustrate
the London “pull factor” for French expats living in the capital; that is, many
are arguably escaping the inertia and complacency of French institutions and
mindsets in their decision to emigrate to London, as exemplified in other forms
of empirical evidence gathered for this research. Here, therefore, the data
gathered from a single Website in the DDA has served to triangulate several key
findings in my PhD.
October 2013
Having exhausted most of the available search options during the
previous trial sessions, this was the shortest and least enlightening of all.
It was necessary, nonetheless, to conduct a final test with the most functional
interface to date and a now complete data set. The search tools also provided
an opportunity for sentiment analysis, unavailable in previous trials.
This experiment involved a phrase search for “London French community” combined
with “English language”
and “very negative” sentiment filters. No results were identified. When the
French language was used and chosen as a filter, 240 matches were found, but
these were of little relevance to my research given their pedagogical focus.
One potentially valuable find for historians of the French presence in the UK
was a Website on Augustine monks, in which the flight of monks from France
during and after the French Revolution, and the creation of brotherhoods in
York (1802), Bristol (1818) and Ealing (1897), where a Benedictine monastery
was founded, were reported. However, in view of the contemporary emphasis of my
research, this proved of little relevance, once again.
Further searches, using different phrases/words, content types and
language/sentiment filters,
were also trialled, to no avail. Furthermore, it was disappointing to note that
the post-code and media filters appeared to have been removed, or were not
readily visible.
Overall, if not the least successful of the trials conducted to date, this was
the most frustrating,
given the unfulfilled aspirations of working with the complete data set.
3 - Lessons Learnt
The lessons learnt from this exercise are as follows:
- “Think small” – minimising one’s research objectives is perhaps the only way of
navigating the enormity of the data.
- Maximise material – as the deep search process is akin to searching for the
proverbial needle in a haystack, any data identified as pertinent should be
analysed immediately, or saved for subsequent analysis, due to the apparent
randomness of the retrieval process.
- Use big data for its quantitative value, but not for drawing representative
conclusions or in an attempt to test large-scale hypotheses, due to the
apparent fallibility of the findings. Therefore, restrict qualitative
research to the micro-findings of those Web sites and Web pages found to
be of value – albeit somewhat arbitrarily – and optimise this data for its
comparative and preservation worth.
4 - Future research and AADDA Recommendations
As regards my own research, I intend to explore the identity / Habitus
evidence found in early Websites (1996 / 1997) in greater detail and compare it
with contemporary Blogs to establish whether the same affiliations are present
and the same sense of group, or otherwise, identity. These findings will also
be compared and triangulated with the qualitative data gathered from one-to-one
interviews with members of the contemporary French population in London. It is
also possible that I will study sample historical Websites / Webpages alongside
their contemporary equivalents, from a multimodal perspective, to gain an
understanding of how technological constraints might influence the making of
meaning to varying degrees over time.
It is unlikely that the post-code filter searches will be used to
inform my research, given the weakness of the findings, but the process was
worthwhile in its disproving of my theory, and some cautious, small-scale
conclusions could be drawn from the associations with the NW5 district.
With respect to the AADDA project looking forward, the following
recommendations have been tentatively made:
- (colour?) coding to indicate both sites already visited and replica Webpages
(identified repeatedly according to the sweep date)
- inclusion of a Blog filter (in addition to the <.org.uk>, <.ac.uk>, etc.
domain filters)
- “links to host” tool to open link in new window
- retention of the post-code and media type (image, audio, video, etc.) filters,
with tags / provenance if possible
- user-friendly search “help” / search “tutorial” function as cursor hovers over
certain fields (such as host links, domains, numbers, etc.) on the deep search
landing page (giving particular advice regarding correct wording and
punctuation, for example)
5 - Conclusion
The lasting impression, having carried out several trial sessions using
the DDA data and its current search tools, is that the results can present
islands of valuable resources within a sea of irrelevant material, but that the
likelihood of finding them is dictated by chance rather than design. Throughout
this testing process, I have pondered the reason for the seemingly arbitrary
nature of my AADDA findings and for my failure to access a greater amount of
material relevant to my research; that is, the question of whether my lack of
technological expertise was the cause or whether such outcomes are inherent to
searching this vast set of data has been recurrent and remains unanswered.
Instructions offering clear guidelines on the best ways to use the archive and
acknowledging its limitations would therefore be both helpful and reassuring to
researchers.