An approach for identifying and analysing reference features and spatial relations used in mountain emergency calls

The CHOUCAS project aims at helping mountain rescue services to locate a victim who describes her location by means of spatial relations with reference geographic features. In this context, the study presented in this paper aims at better understanding what reference features and spatial relations are used to locate oneself in a mountain area, with the ultimate aim of designing tools to help the rescuers. Audio tapes of real emergency calls were used as starting material. The core of the work is the design of a template to transcribe the location information contained in these calls while structuring it. A first analysis of the transcribed calls shows that projective or directional static spatial relations are the most used, and that a finer classification of reference features and spatial relations is needed. In order to synthetically present the location information contained in a call, an additional representation by means of a sketch map with a dedicated symbolisation is proposed.


Introduction
The CHOUCAS project (choucas.ign.fr), funded by the French National Research Agency, aims to propose methods and tools that improve the decision-making process of rescue workers who have to locate victims in mountain areas from the emergency calls they receive (Olteanu-Raimond et al., 2017a; Olteanu-Raimond et al., 2017b). In several situations, it is not possible to locate the victim using technologies such as GPS. In such cases, the probable location has to be built from the clues collected during the emergency call. These clues are elements given by the caller to describe his or her location, like "I am on a footpath", "I can see a lake", etc. They give indications on the location by means of spatial relations ("on", "can see") with respect to a reference feature ("a footpath", "a lake"). This way of describing a location is called indirect spatial referencing (ISO, 2003) or indirect georeferencing (Hill and Zheng, 1999). Hill (2006, p. 2) describes it as informal and used in ordinary discourse, in contrast to the formal way of describing a location with coordinates in a spatial reference system, which is called direct spatial referencing (ISO, 2003). The task of the rescuer who receives an emergency call is to transform the collected set of clues (indirect spatial referencing) into a set of possible locations expressed as coordinates on a map (direct spatial referencing).
The CHOUCAS project tackles three major issues related to this process: the collection, enrichment and querying of geographic data stemming from heterogeneous sources (Gaio and Moncla, 2019; Van Damme et al., 2019); the design of geovisualisation environments that support the reasoning of the rescue workers (Viry et al., 2019); and the design of models for semi-automated spatial reasoning that compute an area for each clue extracted from a call (Bunel et al., 2018) and fuse these areas to generate a probable location (a fuzzy area) where the victim may be. Underlying these issues (especially the second and third ones) is the need to understand and formalise the information contained in such clues in order to be able to exploit them. This is in line with the notion of Naïve Geography, defined by Egenhofer and Mark (1995) as "the field of study that is concerned with formal models of the common-sense geographic world". This paper describes an ongoing study that aims at formalising and better understanding the information contained in clues given by people who call the French mountain rescue service, based on a corpus of audio tapes of emergency calls. The rest of the paper is structured as follows. Section 2 describes the objectives of the study and the audio corpus we started from. Section 3 describes a tabular template that was set up to transcribe and structure the information contained in the audio calls. Section 4 presents a first analysis of the collected structured information. Section 5 reports on an additional proposal, which consists in providing a visual synthesis of the information contained in a call in the form of a sketch map. Finally, Section 6 concludes with a summary and a discussion of this study.

Objectives
In this paper, we present the approach we set up to collect and analyze spatial relations and reference features from an audio corpus of about forty-five real emergency calls provided by the mountain rescue team, the High Mountain Gendarmerie Platoon of Grenoble. Listening to these recordings was very useful to better understand the nature of the information and to better perceive the content of the exchanges in real situations. For example, the following elements were noted:
- Callers do not describe locations with a consensual vocabulary (neither for toponyms nor for the conceptual terms used to describe reference objects);
- Approximate information is often provided ("I walked one to two hours");
- Information can be provided in terms of negation: "there is a hut: I am not there", which is in itself useful information;
- Information is given on what the caller sees from his/her position, which refers to intervisibility situations;
- The degree of reliability to be given to the information the caller provides may sometimes be questioned.
Although very informative in their original form, these raw materials (audio tapes) obviously needed to be structured in a way that allows an efficient analysis of their content and serves the different objectives of the CHOUCAS project. The goal of structuring the information is to formalize knowledge using a model that covers all the facets of information relevant to our scientific challenges.
A finer identification of the salient reference features and spatial relations employed by callers is also needed for the part of our work dealing with the design and development of a geovisualisation tool that supports the reasoning process of rescuers. Callers can give more or fewer details, be sure of themselves or not when describing their route, and often use vague, uncertain or imprecise words in their descriptions. Structuring the information is a first step towards providing relevant spatial data collection and delivery components for processing the emergency call.
For the collection, enrichment and querying of geographic data, structuring the knowledge contributes to building a corpus of callers' expressions that guides the approaches for automatically matching heterogeneous data. It will also help to identify, in the relative positioning expressions used by callers, real objects of the terrain that are not currently explicitly referenced in data sources but could be automatically computed by spatial analysis methods. Examples of such objects are hairpin bends or crossroads, which could lead us to propose a semantic and geometric enrichment of geographic data sources.
For the research work dealing with spatial reasoning algorithms that compute areas based on different clues and fuse them to determine possible location areas, structuring the information is important to explore more precisely which concepts of the terrain are used as references in real situations, and in association with which spatial relations. Even slight variations in the vocabulary used must be detected in order to accurately transcribe the semantics of the situation the caller describes. Being under a bridge (in the sense of sheltered under the bridge) is not always the same as being below a bridge (at a lower altitude, for instance). In these cases, the methods to be used for transforming the associated clues into location areas are different.
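The difference between the two readings can be made concrete with a minimal sketch. It is an illustration only, not the method used in CHOUCAS: the bridge is reduced to a point, candidate positions are hypothetical (x, y, altitude) tuples in meters, and the thresholds `max_offset` and `max_distance` are arbitrary assumptions.

```python
from math import hypot

# Hypothetical reference feature: a bridge reduced to a point (x, y, altitude), in meters.
BRIDGE = (100.0, 200.0, 1200.0)


def is_under(candidate, feature, max_offset=10.0):
    """'Under' read as sheltered: horizontally very close to the feature and not above it."""
    x, y, z = candidate
    fx, fy, fz = feature
    return hypot(x - fx, y - fy) <= max_offset and z <= fz


def is_below(candidate, feature, max_distance=500.0):
    """'Below' read as lower altitude: in the vicinity of the feature, at a lower altitude."""
    x, y, z = candidate
    fx, fy, fz = feature
    return hypot(x - fx, y - fy) <= max_distance and z < fz


# Hypothetical candidate positions.
candidates = [
    (102.0, 203.0, 1195.0),  # right beneath the bridge
    (300.0, 350.0, 1100.0),  # downslope, 250 m away horizontally
    (100.0, 200.0, 1300.0),  # higher than the bridge
]

under = [c for c in candidates if is_under(c, BRIDGE)]
below = [c for c in candidates if is_below(c, BRIDGE)]
```

With these assumed values, only the first candidate satisfies the "under" reading, while the first two satisfy the "below" reading, which shows why the two expressions should not be transcribed as the same relation.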

Building a template to transcribe emergency calls
Our aim is not to annotate a verbal speech while keeping the words unchanged. It is to interpret the spatial information contained in it, and structure this information in a table. However, it is interesting to notice that some of the retained columns are close to the structuration of information proposed in ISO-Space (Pustejovsky, 2017), an annotation scheme dedicated to the semantic annotation of spatial relations directly in texts that has become an ISO standard.
A template was developed as a tabular file through a collective and iterative process based on the study of a few audio tapes. The template is filled in while listening to the audio files and allows the storage of complex expressions composed of verbs, spatial relations, reference features, and modifiers. It is structured in three sections: metadata, interpretation of expressions, and spatial features.
The metadata tab describes the name of the audio file and the time of the call (e.g. year, month, day or hour), and specifies whether the caller is the victim or a third party. Beyond describing the audio file, this section also records the type of mountain activity and the real location of the victim (if the information is accessible), which can be used respectively to adapt the methods that transform an indirect location into a direct location (an area) and to validate the proposed methods.
The tab named interpretation of expressions is structured into 17 columns. Three main categories can be distinguished: extract, context and expression. The first category (see Table 1) describes an extract (as heard in the audio call), the identifiers of the extract, and a timestamp (i.e. the time at which the extract occurs in the audio file). It makes it possible to better understand the expression and to rapidly find the moment when the expression was mentioned in the audio file. We need to distinguish between extracts and expressions, because sometimes two expressions are linked together and cannot be dissociated, such as "I'm at Grand Tollier, I'm just below" or "I'm in front of a rock which is on the left of the road". The identifiers are assigned by the transcriber. The Id expression makes it possible to distinguish the expressions of an extract that contains more than one, each expression being coded separately (one line of the table). In the example given in Table 1, the extract (Id extract = 3) contains two expressions: "I'm at the Grand Tollier" (Id expression = 1) and "I'm just below" (Id expression = 2). The second category describes the context of each expression. The columns of this category require an interpretation of the extract: the speaker (i.e. whether the expression is used by the victim, the rescuer, or the third-party caller) and a confidence (the confidence granted by the transcriber to what is said: strong, normal, or weak). In the example illustrated in Table 1, the speaker is the victim and the confidence is normal (the speaker has no doubt about his/her location). Finally, the last category is used to interpret and code each expression. The following columns are defined: subject (identifies whether the subject is the victim or another reference feature), reference feature (specifies the reference feature used), spatial relation, and modifiers.
The modifiers make it possible to better describe the characteristics of subjects, verbs (walk slowly), spatial relations (just in front) and reference features (round lake). They are intended to be used later to better tune the parameters of the methods that compute a possible location area from an expression. Thus, four columns are defined for modifiers: subject modifier, verb modifier (Verb mod), spatial relation modifier (Spatial relation mod) and reference feature modifier (Ref feature mod).
Examples illustrating the interpretation and transcription for different expressions are given in Table 2.
The subject can also be characterized by a modifier.
Table 2: Columns describing the expressions "I'm at Grand Tollier", "I'm just below", "I see partially a lake just in front", "I have to my left a round lake", and "I was walking slowly to a ski station".
Let us mention that for ternary spatial relations, two reference features are present in the expression (e.g. "I'm between Rosay village and a huge bridge"). To model ternary spatial relations, two new columns are added to identify the second reference feature and its modifier.
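As an illustration of how one line of this tab could be represented in software, the following minimal sketch encodes a subset of the columns as a Python data class. The field names, the identifiers of the last two rows, their speaker and confidence values, and the way each expression is coded are assumptions made for this example; they are one possible reading of the expressions quoted above, not a reproduction of the actual tables.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Expression:
    """One line of the 'interpretation of expressions' tab (subset of its columns)."""
    id_extract: int
    id_expression: int
    speaker: str                              # "victim", "rescuer" or "caller"
    confidence: str                           # "strong", "normal" or "weak"
    subject: str
    spatial_relation: str
    ref_feature: Optional[str] = None
    verb: Optional[str] = None
    subject_mod: Optional[str] = None
    verb_mod: Optional[str] = None
    spatial_relation_mod: Optional[str] = None
    ref_feature_mod: Optional[str] = None
    ref_feature_2: Optional[str] = None       # ternary relations only
    ref_feature_2_mod: Optional[str] = None   # ternary relations only


rows = [
    # "I'm at Grand Tollier" / "I'm just below": same extract, two expressions.
    Expression(3, 1, "victim", "normal", "victim", "at", ref_feature="Grand Tollier"),
    # Assumption: the reference feature is inherited from the linked expression.
    Expression(3, 2, "victim", "normal", "victim", "below",
               ref_feature="Grand Tollier", spatial_relation_mod="just"),
    # "I was walking slowly to a ski station" (hypothetical identifiers).
    Expression(4, 1, "victim", "normal", "victim", "to",
               ref_feature="ski station", verb="walk", verb_mod="slowly"),
    # "I'm between Rosay village and a huge bridge" (hypothetical identifiers).
    Expression(5, 1, "victim", "normal", "victim", "between",
               ref_feature="Rosay village",
               ref_feature_2="bridge", ref_feature_2_mod="huge"),
]

# Ternary relations are the rows that use the two additional columns.
ternary = [r for r in rows if r.ref_feature_2 is not None]
```

Keeping the modifiers in separate optional fields mirrors the template's choice of dedicated modifier columns, so that they can later be used to tune the computation of a location area.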
Finally, the entity tab describes, for each reference feature identified in the expression, its type (e.g. lake) and its name if appropriate (Lake of Roselette). This section makes it possible to identify the types of reference features that people use as landmarks when describing their location and the itinerary they followed in the context of different mountain activities. Let us mention that coding these two columns sometimes requires the transcriber to take the spatial context into account by looking at a map and considering the whole extract or other extracts found in the same audio file. If the caller says "I'm in Chalence", it is difficult to know the type of the feature or whether the name of the entity is correct. In the given example, the correct name is Coomb of Chalence, and the type is coomb.
For each element of the template, a definition and explanation of how to fill it is provided, and different examples of how extracts and expressions should be coded into the template are also given to the transcriber.

Semantic Classification
The template has been filled with two corpora of emergency calls. The first corpus is composed of 15 emergency calls, transcribed in the template by a rescuer and partially verified. The second corpus is composed of 30 emergency calls, transcribed and verified by the authors. In total, 374 clues containing a spatial relation have been identified. The spatial relations are classified in two classes according to their semantic properties: static relations (which describe a fixed position) and dynamic relations (which describe a movement or a route). These classes are similar to those proposed by Borillo (1998) for classifying spatial prepositions, but, unlike Borillo (1998), we chose to classify the spatial relations rather than the spatial prepositions. This choice is motivated by the semantic variability of French spatial prepositions (Vandeloise, 1986, p. 18; Borillo, 1998, p. 84). The same preposition can be static or dynamic depending on the context. For example, the French preposition "à" is dynamic in the sentence "Je vais à la montagne" ("I'm going to the mountains") and static in the sentence "Je suis à la montagne" ("I am in the mountains"). The static relations are subdivided into five classes: topologic relations (which denote inclusion or connection), projective and directional relations, qualitative distance relations, quantitative distance relations (including quantitative absolute altitudes) and visibility relations. This subcategorization also helps us to distinguish the multiple semantics of a single preposition. To return to the example of the French preposition "à", it may be static and topological, as in the expression "Je suis au sommet" ("I am at the summit"), or static and denoting a quantitative distance, as in the expression "Je suis à 500 mètres du sommet" ("I am 500 meters away from the summit"). This subcategorization is inspired by the classification of spatial relations proposed by Clementini (2013).
However, we chose to add to this classification a class for the visibility relations, and we distinguish qualitative and quantitative distances. Borillo (1998) proposed another classification distinguishing topological and projective prepositions, but these two classes are too broad and the terms used to name them have a meaning different from the one usually used in geomatics. Despite these modifications, some categories are still not totally satisfactory, as some spatial relations are difficult to classify. For example, the expression "I am on the bottom of the ski slope" can be classified as a topologic relation but also as a projective and directional relation. We chose to categorize this kind of spatial relation as projective and directional, but this choice is questionable. We chose not to detail the classification of dynamic relations: these spatial relations have not yet been analyzed, especially because modeling them requires making hypotheses about the route followed by the victim (or the caller).
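The classification described above can be summarised by a small enumeration. This is a minimal sketch: the class and member names are assumptions, and the example assignments simply restate the cases discussed in the text. The point it illustrates is that the class is attached to the relation as used in an expression, not to the preposition itself.

```python
from enum import Enum


class RelationClass(Enum):
    TOPOLOGIC = "topologic"
    PROJECTIVE_DIRECTIONAL = "projective or directional"
    QUALITATIVE_DISTANCE = "qualitative distance"
    QUANTITATIVE_DISTANCE = "quantitative distance"
    VISIBILITY = "visibility"
    DYNAMIC = "dynamic"  # not subdivided in this study


def is_static(relation_class):
    """Every class except DYNAMIC is a subclass of the static relations."""
    return relation_class is not RelationClass.DYNAMIC


# Expressions quoted in the text, with the class assigned to them there.
EXAMPLES = {
    "Je vais à la montagne": RelationClass.DYNAMIC,
    "Je suis au sommet": RelationClass.TOPOLOGIC,
    "Je suis à 500 mètres du sommet": RelationClass.QUANTITATIVE_DISTANCE,
    # Ambiguous case, classified as projective and directional by convention.
    "I am on the bottom of the ski slope": RelationClass.PROJECTIVE_DIRECTIONAL,
}

# The same French preposition "à" (contracted as "au") falls into several classes.
classes_for_a = {
    cls for text, cls in EXAMPLES.items() if text.startswith("Je ")
}
```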

Quantitative analysis
Among the 374 identified instances of relations, 286 are static and 88 are dynamic (Table 3). In the corpus, the visibility relations are the least used (18 occurrences). Distance relations, whether qualitative or quantitative, are also rather rarely used (24 and 31 occurrences respectively), in contrast to the topologic relations (113 occurrences) and the projective or directional relations (95 occurrences). We did not identify any significant differences between the two corpora. Each spatial relation is associated with a reference feature. In our corpus, many reference features are specific to the mountain context. For example, the most used reference features (around 38 %) are related to orography (e.g. "summit", "rock", or "plateau"). This category is wide and contains objects of very variable size, such as mountains and glaciers or rocks and passes. Furthermore, some types of orographic features are missing from spatial databases, which calls for enrichment work. The orographic features are mainly used with static relations, whether topologic ("in a valley") or projective ("on the other side of the ridgeline"). A majority of reference features are named (around 58 %); they are generally summits or other natural features, like lakes or salient rocks. Another important part (32 %) of named entities denotes villages, cities and, more generally, localities. However, no visibility relation refers to this latter type of entity in the analyzed emergency calls: the visibility relations are mainly related to salient natural objects, like summits or lakes. In addition to the previous categories, which include a majority of the reference features, we note the use of immaterial objects (coordinates, altitude) in some cases. Sometimes, the caller describes her/his position using map elements, as in the expression "I am under the red dashed line".
This case is rare, but it shows the importance of the frame of reference (in that case the map) and more generally of the context in the interpretation of an utterance in natural language, as argued by Bateman et al. (2010).
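For readers who want to relate these figures to one another, the counts quoted in this section can be turned into shares with a few lines of Python. This is a simple arithmetic sketch, and the variable names are assumptions. The five static subclass counts quoted in the text add up to 281, so the sketch reports shares of the 374 instances rather than assuming that the subclasses partition the 286 static relations.

```python
# Counts quoted in the text.
TOTAL = 374
STATIC = 286
DYNAMIC = 88

static_subclasses = {
    "topologic": 113,
    "projective or directional": 95,
    "quantitative distance": 31,
    "qualitative distance": 24,
    "visibility": 18,
}

# Share of each class among all identified relations, in percent.
shares = {
    name: round(100 * count / TOTAL, 1)
    for name, count in static_subclasses.items()
}
shares["dynamic"] = round(100 * DYNAMIC / TOTAL, 1)

subclass_sum = sum(static_subclasses.values())

for name, share in shares.items():
    print(f"{name}: {share} %")
```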

Spatial relations
However, this quantitative analysis needs to be improved in further research. First, the classification of spatial relations is too broad to characterize their semantics, and by extension, to make a precise analysis. Therefore, we are currently working on the definition of an ontology of spatial relations, based on the GUM-Space ontology proposed by Bateman et al. (2010). Similarly, the classification of reference features needs to be improved: some classes, like the orography class, are too broad. We are therefore also working on the definition of an ontology of reference features, focused on mountain context. This one is based on the GEONTO ontology proposed by Mustière et al. (2011).

Utility of a synthetic visualisation
As reported in the previous section, a quantitative analysis of the terms used to locate oneself is not sufficient and a qualitative analysis is needed, in order to understand the actual semantics of a given term denoting a spatial relation in a given situation. In this respect, the transcription template, once filled for a given emergency call, gives access to the location information contained in the call in a much more practical form than the initial audio tape: only the relevant part of the information remains, it is structured and can easily be read without being constrained by the sequential aspect of the audio tape. However, looking line by line at a table containing lots of columns is still not a very easy way of getting a global view of the used reference features and terms denoting spatial relations in a call. To make the access to the situation described in a call easier, we propose to represent the location information contained in a call as a sketch map, i.e. a (schematic) map "which is drawn from observation rather than to exact scale measurements and which shows the main features of an area" (Collins online dictionary). In our case, only the victim, the caller (if different), the reference features and the related spatial relations used for location should be represented on this sketch map. One sketch map corresponds to one call, therefore if several calls are to be analysed together, one sketch map per call should be drawn. The intended users of those sketch maps are the researchers who work on the analysis of the information contained in the emergency calls (in our case, they are members of the CHOUCAS project). The intended use is to help them figure out very quickly what pairs of (spatial relation, reference features) have been used by the caller to describe the location of the victim in a call. 
A sketch map could then be viewed side by side with a topographic map of the zone on which the reference features that have been identified without ambiguity are highlighted (Figure 1), as well as the actual position of the victim, if known from the filled transcription template, in order to understand the semantics of the spatial relations used. Several sketch maps (and associated annotated topographic maps) could also be observed together to detect cases where similar spatial relations have been used and investigate whether their semantics seem similar, or on the contrary, situations that seem spatially similar but where the spatial relations used are different.
Figure 1. Example of an annotated topographic map highlighting the identified reference features used in a call (same call as the sketch map of Figure 3).

Designing the sketch maps
A design for the sketch maps was proposed and assessed in the context of a student project. Symbols were designed for the reference features and spatial relations that appeared most frequently in the first corpus of calls transcribed in the template. It was judged unnecessary to distinguish between all types of reference features; therefore all features are represented by the same red dot symbol labelled with its type and name (if known), except for features that usually have a "big" linear or polygonal spatial extent and on/in which the victim can be situated (forest, road/path). Most of the relations are represented by means of labelled arrows. Figure 2 shows the final proposed symbols. The absolute position of the different elements (victim, reference features) is considered unknown, so that it is possible to draw a sketch map even if the reference features have not been identified on a map or in a dataset and the actual location of the victim has not been indicated in the metadata tab of the template. On the sketch map, the victim is positioned in the center, and the reference features used are positioned around it. If possible, the features that the victim is above are positioned below the victim, and vice versa. Only one example sketch map was produced in the context of this work (Figure 3).
Figure 3. Sketch map describing the following clues (the caller is with the victim): "We started from the Plateau of Pra mountain hut", "We are below the Plateau of Pra mountain hut", "We are above the Oursière chalet", "We are above the Oursière Waterfall", "We are on the Oursière Waterfall track", "We are in a forest", "We are at an altitude of 1500 meters". Blue labels = English translation, added afterwards for the purpose of this paper.
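The placement rule (victim in the center, features the victim is above drawn below it, and vice versa) can be sketched as follows. This is a minimal illustration and not the procedure used in the student project: the function name, the unit coordinates, and the choice to put all other features on the victim's row are assumptions, and the linear and polygonal symbols (track, forest) are not handled specifically.

```python
def layout(clues, spacing=1.0):
    """Place the victim at the origin and each feature around it (y axis pointing up).

    clues: list of (relation, feature) pairs, where relation describes the
    victim's position with respect to the feature.
    """
    positions = {"victim": (0.0, 0.0)}
    rows = {"below": [], "above": [], "other": []}
    for relation, feature in clues:
        key = relation if relation in ("above", "below") else "other"
        rows[key].append(feature)

    # Victim below a feature -> feature drawn above the victim (y = +1).
    # Victim above a feature -> feature drawn below the victim (y = -1).
    for key, y in (("below", 1.0), ("above", -1.0)):
        features = rows[key]
        for i, feature in enumerate(features):
            x = (i - (len(features) - 1) / 2) * spacing  # center the row
            positions[feature] = (x, y)

    # Remaining features are placed on the victim's row, to its right.
    for i, feature in enumerate(rows["other"]):
        positions[feature] = ((i + 1) * spacing, 0.0)
    return positions


# Clues taken from the caption of Figure 3.
clues = [
    ("below", "Plateau of Pra mountain hut"),
    ("above", "Oursière chalet"),
    ("above", "Oursière Waterfall"),
    ("on", "Oursière Waterfall track"),
]
positions = layout(clues)
```

Because absolute positions are considered unknown, the coordinates only express the relative arrangement of the symbols, not distances on the ground.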

Evaluation
The proposed symbols and sketch map were assessed through a user test with 28 people skilled in cartography and geomatics, and therefore representative of the intended users of the sketch maps. Three test sessions were organized in sequence: first with 25 students and teachers, then with one researcher from the CHOUCAS team, then with two other researchers from the team, the symbols being improved after each session. During the first session, two tests were performed (with 12 and 13 people respectively): the first task was to interpret the sketch map of Figure 3 without its key (aim: test whether the chosen symbols are intuitive), and the second was to draw a sketch map for another transcribed call using the key (aim: test whether choosing and placing the symbols on the sketch map is easy). Only the first task was repeated in the second and third sessions.
The drawing test showed that choosing and placing the symbols from the key seems relatively straightforward. The interpretation test enabled us to improve in particular the symbols proposed for the relations "in the forest" (forest represented by a filled region with a fuzzy shape and no boundary line rather than by an empty region with a crisp elliptical boundary line) and "above/below an object" (label + arrow needed). The footprints indicating a place the victim comes from were logically interpreted as denoting a trip on foot. An adaptation for other modes of locomotion has yet to be made.
It is difficult to draw general conclusions from a test performed on a single sketch map, but it is likely that the finding for the "above/below" relations (label + arrow needed) would also be valid for other relations. Further work is needed to consolidate this sketch map proposal; among other things, the notion of "frame of reference" for directions (Clementini, 2013) was not addressed.

Discussion and conclusion
The context of the reported study is the need to gain knowledge on how people describe their location by means of spatial relations with reference features in the context of mountain rescue, in order to later propose tools to support the work of the rescuers (GUIs, tools to automatically compute a possible location from a clue, tools for data integration and enrichment). The objective of the reported study is to set up a method to (manually) structure information contained in emergency calls and related to the description of the location of the victim or caller. Such a method was set up and applied to a real corpus of calls: a template dedicated to the manual transcription of the information contained in the calls was defined and used to transcribe 45 calls. A first analysis of the content of these 45 transcriptions was performed and an additional representation of a call by means of a sketch map was proposed. Now, beyond the possible improvements already identified in previous sections, some issues raised during this study deserve discussion.
First, we gained feedback on the designed transcription template. Its use by several people (a rescuer, researchers from the CHOUCAS team, and students) showed that it is not easy to become familiar with. As a result, transcriptions of the same call by different people can be heterogeneous, which requires the transcriptions to be consolidated. This is probably due to the complexity of the template, and to the fact that the examples provided with it are not necessarily read. Face-to-face training might be more appropriate. The clues that are difficult to transcribe are especially those where an absolute altitude is used as a reference feature (it is not natural to consider an altitude a reference feature), and those using a projective relation where the frame of reference for directions is intrinsic (Clementini, 2013), as in "I have a lake at my left" (often wrongly transcribed as "I am to the right of a lake").
Another issue that emerged is the multiplicity of possible interpretations for a given spatial relation (e.g. "below", cf. Bunel et al., 2018), and the possible consequences for the future geovisualisation interface that will be built to support the rescuers: if each relation corresponds to a possible choice in a scrolling list, how can we ensure that the interface includes the "right" number of choices, i.e. enough to distinguish between situations that need to be processed differently to build a possible location, but not so many that the interface becomes unusable by the rescuer?
To enable good interoperability between the methods dedicated to the computation of a possible location and the interface components, and because the spatial relations and reference features used are not homogeneous from one speaker (caller, rescuer) to another, it became clear that a shared vocabulary is needed. Therefore, as mentioned in Section 4, work has been undertaken to build ontologies dedicated to the spatial relations and reference features used in the context of location in mountainous areas.
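The transcription error mentioned for "I have a lake at my left" can be illustrated with a small geometric sketch. It assumes a flat (east, north) coordinate system and a heading measured clockwise from north; the function names are hypothetical. With an intrinsic frame of reference, the side on which the lake lies depends on the direction the speaker is facing, so the clue cannot be rewritten as a fixed position of the speaker with respect to the lake.

```python
from math import cos, radians, sin


def left_of(heading_deg):
    """Unit vector (east, north) pointing to the left of a person facing heading_deg.

    heading_deg is measured clockwise from north (0 = north, 90 = east).
    """
    h = radians(heading_deg)
    facing_east, facing_north = sin(h), cos(h)
    # Rotating the facing direction by 90 degrees counterclockwise gives the left.
    return (-facing_north, facing_east)


def victim_offset_from_lake(heading_deg):
    """Direction from the lake to the victim, given 'I have a lake at my left'."""
    east, north = left_of(heading_deg)
    return (-east, -north)


# Facing north: the lake is to the west, so the victim is east of the lake.
facing_north = victim_offset_from_lake(0)
# Facing south: the lake is to the east, so the victim is west of the lake.
facing_south = victim_offset_from_lake(180)
```

On a north-up map, the transcription "I am to the right of a lake" would therefore match the first case only, which is why the heading (or the absence of information about it) has to be preserved in the transcription.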