BACHELOR'S THESIS. GIS to show when, where and how death occured. Jenny Hagman. Högskoleingenjörsprogrammet Geografisk informationsteknik

2000:78
BACHELOR'S THESIS
GIS to show when, where and how death occured
Jenny Hagman, GIS-3
Högskoleingenjörsprogrammet Geografisk informationsteknik (GIS-engineer programme, Kiruna)
Institutionen i Kiruna, Luleå University of Technology
ISSN: [...]
ISRN: LTU-HIP-EX--00/78--SE
Document: report v4.doc

Preface

This report is submitted in fulfilment of an undergraduate project in the GIS (Geographical Information System) engineer study programme. The programme is located in Kiruna and belongs to Luleå University of Technology. I carried out the undergraduate project in the spring of 1999 at SMC (Spatial Modelling Centre), Kiruna.

Table of contents

Abstract
Sammanfattning
Introduction
Aims
Methodology
The application
    Creating new tables
        Table 1: Region, year and population
        Table 2: The number of dead people, cause of death, year, region
    The graphical user interface
        The outline map
        The views
    Connect the application to the new tables in the database
A guide through the application
Results
    Does the application work?
    Is it easy to use?
Discussion
Conclusions
Acknowledgment
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5

1 Abstract

SMC has had a database, referred to as TOPSWING (Total Population of Sweden Individual and Geographical database), since [...]. The major obstacle to its use by a wide range of users is the inefficient way in which the data are accessed and displayed. This undergraduate project aims to create an application that simplifies this process in the context of research into causes of death.

2 Sammanfattning (Summary)

SMC has had a database, referred to here as TOPSWING (Total Population of Sweden Individual and Geographical database), since [...]. The biggest obstacle that arises when many users query the database is the inefficient way in which the data are processed and displayed on screen.
This thesis project aims to create an application that makes this process easier when investigating causes of death.

3 Introduction

SMC is a research centre governed by Umeå University and Luleå University of Technology. In contrast to many other research institutes, SMC has focused its research on the human factor in environmental change. Work at SMC is undertaken on two fronts: pure and applied research at an international level, and application-driven studies.

Several international projects are now assembling natural science databases about the environment. At SMC these will be complemented by time- and space-specific information about the population and its activities. SMC will use the rich Swedish statistical resources to test the potential of micro-data databases for this type of research. Methodological developments will be stimulated within, for example, time-geographic microsimulation, geographic information techniques, data languages and artificial neural networks.

SMC is affiliated with the Environment and Space Research Institute (MRI), a research and development project under the Swedish Council for Planning and Co-ordination of Research (FRN). MRI has further affiliations: the Atmospheric Research Program (AFP), the Climate Impact Research Centre (CIRC) and the Environmental Satellite Data Centre (MDC).

SMC has a database that contains socio-economic information about all people in Sweden during the years [...]. The database is called TOPSWING (Total Population of Sweden Individual and Geographical database). It holds information such as where a person lives, income, education level, migration, civil status, sex and age. If a person has died, it also records the cause of death, the year of death and so on. TOPSWING is managed with Microsoft SQL Server.
The data in the database come from Statistics Sweden, and the database was created in 1997 by merging different registers. The main aim of SMC is to develop a microsimulation model of Sweden using the database as support.

Information in the database can be reached through Microsoft Access, SQL graphics edition or SPSS (a statistical program). The most common way is to use Microsoft Access and write SQL statements (SQL, Structured Query Language, is a query language used to get information from a database). Because the database tables are so large, processing a single SQL statement can take a very long time. One part of this undergraduate project was to make searching the database easier and less time-consuming.

The particular application studied was causes of death. I did not undertake any analysis myself: where, when and why deaths occur is for the scientist to decide.

4 Aims

The aim of this undergraduate project was to create an application in ArcView 3.0 using the programming language Avenue. The application can be used at the start of a research project or to investigate unknown patterns. Exploratory analysis of the data is important, but in the current setting it is not as easy as it should be. Manipulating several tables in the database and then exporting the data to different software systems is tedious and time-consuming.

In the application, the user should be able to choose a cause of death, a year and one or two regions in Sweden.
These are the different divisions of Sweden:

- All of Sweden
- County
- Municipality
- Parish

The result is shown as a rate: the number of people who died in the selected region divided by the number of people in the region, times 1000 (Figure 1).

$$\text{rate} = \frac{\text{number of people who died in the selected region(s) during the selected year}}{\text{population in the selected region(s) during the selected year}} \times 1000$$

Figure 1

The result is shown in a bar chart.

5 Methodology

What makes this project special is my approach of constructing new tables so that searching the database becomes much faster. Previously the user had to write long SQL statements to get a result, and searching a database that covers every human being in Sweden over eleven years takes a long time. After the SQL part, the user had to import the data into ArcView to produce the graphs and carry out the investigation. To do the same thing for another cause of death, the only option was to start all over again, which again took a long time. An application that gives the user a result easily, without requiring expertise in SQL or in the database, should hopefully increase the number of people who use the database.

The method I used during the project can be divided into three steps:

1. Create new tables by running SQL statements against the database.
2. Build the graphical user interface (GUI) for the application:
   - Different scales (All of Sweden, county, municipality or parish)
   - Point-and-click features
   - Visual representation of the rate
3. Connect the application to the new tables in the database.

The resources used during the project were ArcView 3.0 and its programming language Avenue, the extension Dialog Designer, TOPSWING, Microsoft Access and Microsoft Word.

6 The application

6.1 Creating new tables

Because the tables in the database did not suit the requirements, new tables had to be created.
This was a very important step because selection from the tables must be fast. If it is too slow, the user will probably get annoyed and stop using the application. Whether selection from the new tables was fast enough would only become clear later.

The new tables were created with SQL statements from the existing tables in the database. One big problem was the time required. In the worst case it could take a day or more to get an answer, only to realise that the question had been wrong.

The new tables contain:

- The population of a region in a single year.
- How many people died in the region from a specific cause of death during a specific year.

Table 1: Region, year and population

This table holds the population of a region in a single year, so the region is essential. A region is not represented as a single number in the database. It is divided into three fields: bcountyno, bcommunityno and bparishno. (The b stands for bostad, the Swedish word for home.) For the tables and their fields, see Appendix 1.

The first step was to create a table (jha_areas_alive) showing the number of living persons in each area each year. Data was extracted from PersonYearOccupation. The SQL statement for this new table can be seen in Appendix 2.

PersonYearOccupation: pid, year, bcountyno, bcommunityno, bparishno, etc.
jha_areas_alive: count(pid) as alive, year, bcountyno, bcommunityno, bparishno

Figure 2

The difference between the two tables may not seem big, but it is. The small word "etc." stands for a number of fields that do not exist in jha_areas_alive.

Accuracy could be assessed by comparing the data in the new table, jha_areas_alive, with data from Statistics Sweden. In general the data became more accurate over time. The figures for the middle of the 1980s more often failed to correspond to the data from Statistics Sweden.
The data for the 1990s are generally more correct than the data for the 1980s, which is a consequence of how the data were collected.

To be able to search for a region in one field, the three fields for county, municipality and parish had to be merged into one. The merged field holds a number of five or six digits, depending on how many digits bcountyno contains. The scheme is shown below:

Level           bcountyno   bcommunityno   bparishno
Parish          AA          BB             CC
Municipality    AA          BB             00
County          AA          00             00
All of Sweden   (no field is selected)

Figure 3

Figure 3 shows that County gets its data from bcountyno only. It needs nothing from bcommunityno or bparishno, so those positions are filled with zeros. Municipality needs the bcountyno of the county that the selected municipality belongs to and the bcommunityno itself, but bparishno is irrelevant. Parish needs data from all three fields. For All of Sweden, no specific field has to be selected because every record should be included.

The table jha_regions_populations was created with an SQL statement. It contains a region field (at this point holding counties only), the year and the sum of all living persons (population). Municipalities and parishes were then added to jha_regions_populations with the INSERT command.

To get the number of the region (the multipliers follow from the two-digit layout in Figure 3):

County:        bcountyno * 10000
Municipality:  bcountyno * 10000 + bcommunityno * 100
Parish:        bcountyno * 10000 + bcommunityno * 100 + bparishno

Figure 4

Data for All of Sweden can be added (INSERT) by setting Region to zero. Grouping by year, and only by year, gives the population of All of Sweden. Appendix 3 shows the SQL statements for this new table and parts of the results.
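The merging of the three area fields and the level-by-level INSERT steps can be illustrated with a small sketch. It is not the thesis's SQL (that is in Appendix 3, which is not reproduced here). It assumes the two-digit AA BB CC layout of Figure 3, uses Python's built-in sqlite3 module instead of Microsoft SQL Server, and fills jha_areas_alive with made-up numbers. The helper names region_code and population are hypothetical.

```python
import sqlite3


def region_code(county, community=0, parish=0):
    """Merge the three area fields into one number laid out as AABBCC.

    Assumption: two digits per field, as the AA BB CC layout suggests.
    """
    return county * 10000 + community * 100 + parish


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jha_areas_alive "
    "(alive INTEGER, year INTEGER, bcountyno INTEGER, "
    "bcommunityno INTEGER, bparishno INTEGER)"
)
# Made-up sample rows: (alive, year, county, municipality, parish).
conn.executemany(
    "INSERT INTO jha_areas_alive VALUES (?, ?, ?, ?, ?)",
    [
        (1200, 1990, 25, 84, 1),
        (800, 1990, 25, 84, 2),
        (500, 1990, 25, 80, 1),
        (900, 1990, 13, 80, 5),
    ],
)
conn.execute(
    "CREATE TABLE jha_regions_populations "
    "(region INTEGER, year INTEGER, population INTEGER)"
)

# One INSERT per level; each level groups on the fields it needs.
conn.execute(
    "INSERT INTO jha_regions_populations "
    "SELECT bcountyno * 10000, year, SUM(alive) "
    "FROM jha_areas_alive GROUP BY bcountyno, year"
)
conn.execute(
    "INSERT INTO jha_regions_populations "
    "SELECT bcountyno * 10000 + bcommunityno * 100, year, SUM(alive) "
    "FROM jha_areas_alive GROUP BY bcountyno, bcommunityno, year"
)
conn.execute(
    "INSERT INTO jha_regions_populations "
    "SELECT bcountyno * 10000 + bcommunityno * 100 + bparishno, year, SUM(alive) "
    "FROM jha_areas_alive GROUP BY bcountyno, bcommunityno, bparishno, year"
)
# All of Sweden: region is set to zero and the grouping is on year only.
conn.execute(
    "INSERT INTO jha_regions_populations "
    "SELECT 0, year, SUM(alive) FROM jha_areas_alive GROUP BY year"
)


def population(region, year):
    """Look up the population of one merged region code for one year."""
    row = conn.execute(
        "SELECT population FROM jha_regions_populations "
        "WHERE region = ? AND year = ?",
        (region, year),
    ).fetchone()
    return row[0] if row else None


print(population(region_code(25), 1990))      # county level
print(population(region_code(25, 84), 1990))  # municipality level
print(population(0, 1990))                    # All of Sweden
```

Because every level ends up in the same region column, the application only has to match one number per lookup, whichever scale the user has chosen.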
jha_areas_alive: count(pid) as alive, year, bcountyno, bcommunityno, bparishno
jha_regionspopulations: population, year, region

Figure 5

Table 2: The number of dead people, cause of death, year, region

This table holds how many people died in a region from a specific cause of death in a specific year. It had to be created in two steps.

In the first step the table jha_pd1 was created from the original table PersonDeath. The following fields were selected: the personal ID (pid) of each person who died, the year before the person died (lastyear) and the cause of death (undorsak). The SQL statement that creates jha_pd1 can be seen in Appendix 4.

The year used is the year before the death, not the year of death itself, because a person who dies during a year has no data for that year. The year before, when the person was still alive, does have data. All tables and their fields can be seen in Appendix 1.

PersonDeath: pid, DeathDate, Undorsak, etc.
jha_pd1: pid, lastyear, undorsak

Figure 6

The second and last step was to create the table jha_pyo1. This table has six fields: pid from the original table PersonYearOccupation; lastyear and undorsak from jha_pd1; and bcountyno, bcommunityno and bparishno, all from PersonYearOccupation. PersonYearOccupation was merged with jha_pd1 using JOIN. The join connects pid in the two tables, and lastyear in jha_pd1 with year in PersonYearOccupation. Joining on lastyear and year picks out the record from the last year in which the person was alive, which is the record tied to the death. The SQL statement that creates jha_pyo1 can be seen in Appendix 4.

jha_pd1: pid, lastyear, undorsak
    JOIN
PersonYearOccupation: pid, year, bcountyno, bcommunityno, bparishno, etc.
    gives
jha_pyo1: pid, lastyear, undorsak, bcountyno, bcommunityno, bparishno

Figure 7

Now the tables jha_pyo1 and jha_regionspopulations are ready to be used!
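The two-step construction of jha_pyo1 described above can be sketched as follows. This is an illustration only, not the SQL from Appendix 4. It runs on Python's built-in sqlite3 module rather than Microsoft SQL Server, assumes that DeathDate is stored as a YYYY-MM-DD string, and uses a handful of made-up persons.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PersonDeath (pid INTEGER, DeathDate TEXT, Undorsak TEXT);
    CREATE TABLE PersonYearOccupation (
        pid INTEGER, year INTEGER,
        bcountyno INTEGER, bcommunityno INTEGER, bparishno INTEGER
    );

    -- Made-up data: persons 1 and 2 die, person 3 stays alive.
    INSERT INTO PersonDeath VALUES
        (1, '1990-03-14', 'diabetes'),
        (2, '1991-07-02', 'diabetes');
    INSERT INTO PersonYearOccupation VALUES
        (1, 1988, 13, 80, 5), (1, 1989, 25, 84, 1),
        (2, 1989, 25, 84, 1), (2, 1990, 25, 84, 2),
        (3, 1989, 25, 84, 1), (3, 1990, 25, 84, 1);

    -- Step 1: keep pid, the year before the death, and the cause of death.
    CREATE TABLE jha_pd1 AS
    SELECT pid,
           CAST(substr(DeathDate, 1, 4) AS INTEGER) - 1 AS lastyear,
           Undorsak AS undorsak
    FROM PersonDeath;

    -- Step 2: join on pid and on lastyear = year to pick up the area fields
    -- from the last year in which the person has a record.
    CREATE TABLE jha_pyo1 AS
    SELECT d.pid, d.lastyear, d.undorsak,
           p.bcountyno, p.bcommunityno, p.bparishno
    FROM jha_pd1 AS d
    JOIN PersonYearOccupation AS p
      ON p.pid = d.pid AND p.year = d.lastyear;
    """
)

rows = conn.execute(
    "SELECT pid, lastyear, undorsak, bcountyno, bcommunityno, bparishno "
    "FROM jha_pyo1 ORDER BY pid"
).fetchall()
for row in rows:
    print(row)
```

In this sample, person 1 moved between 1988 and 1989, and the join keeps only the 1989 record, so the death is attributed to the area where the person last had data.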
As shown before, the tables look like this:

jha_pyo1: pid, lastyear, undorsak, bcountyno, bcommunityno, bparishno
jha_regionspopulations: population, year, region

Figure 8

Unfortunately I was forced to exclude the years 1985 and 1986 from the application. For some unknown reason, a lot of data was missing for these two years. The risk of producing incorrect results was considered so large that this action was necessary.

6.2 The Graphical User Interface

The outline map

The map used in the application comes from the red map (Lantmäteriverket). Its original scale is 1:[...]. In the application the scale changes constantly, for example when the user presses the zoom in button. The data from the red map are divided into four layers. The first is All of Sweden, for which the borders of the country were selected from the county layer and added to a new, separate layer. The county, municipality and parish layers already existed.

Anyone familiar with Sweden needs only a quick look at the map to see that something is very wrong: the coastline is incorrect (see figure 9). All the islands that surround Sweden must also belong to a parish, a municipality and a county, so the borders of these divisions stretch out into the water. Attempts to correct this, so that the border would follow the shoreline, failed.

The views

The Graphical User Interface (GUI) shows a map of Sweden (similar to the one described above). Beside the map there are several comboboxes. A combobox is a box that contains a list, for example of several different areas. Several comboboxes can be seen in figure 9. In the top combobox in the main view, the first combobox, the user can choose between the different scales: All of Sweden, county, municipality or parish. The cause of death is chosen from the second combobox. The third combobox is for region 1 and the fourth combobox is for region 2.
By pressing the button 2 the user can study two regions. The fourth combobox is the only one that is optional. The user can also select a region by first clicking the tool 1 for Region1 or 2 for Region2 and then clicking the wanted region on the map. The name of the region then appears in the combobox. If this is not the intended region, the user can click on the map again, and repeat this until the right region is found. The last thing to select is the year, which is available in the fifth and last combobox.

When the selections have been made, the user presses the button Go! and the connection to the database is established. When the processing is done, the result is shown as a bar chart. The bar chart takes up ¼ of the screen, and the rest shows the map and the comboboxes. The selected region or regions are marked in yellow. From the moment the user presses Go! until the bar chart appears takes a couple of minutes, depending on the size of the region(s).

The extension Dialog Designer was used to create the comboboxes, buttons and text in the view.

Figure 9: The main view, with the bar chart, button 2, button Go!, the map and the comboboxes labelled.

6.3 Connect the application to the new tables in the database

The major part of the work in this project was writing scripts in Avenue. A script is a sequence of commands. Connecting different scripts to each other creates the link between the application and the new tables in the database. Through this link, the rate of people who died can be found based on the settings the user makes. Figure 10 and figure 11 show the order in which the different scripts interact. A summary of what each script does can be read in Appendix 5.

These scripts run when view_granser is opened.
Figure 10: Diagram of the scripts that run when view_granser is opened, numbered 1 to 8 in order of execution: doc.open, start, Start_noselection, Start_sverige_combobox, Val av år samt dc, Val av tema samt area 1 och 2, nolegend, ickesynlig, fullextent.

This tree diagram of scripts starts when the button Go! is pressed:

1. select
2. sammankoppling
   - lstrt is sent to con_all_test.
   - The empty list, lsttom, is sent to con_dc. con_dc returns the selected undorsak string in lsttom.
   - The empty list, lstyear, is sent to year. year returns the selected year string in lstyear.
3. barchart

Other scripts:

- Omradesval_1: runs when the user clicks on the map. Region 1 is selected.
- Omradesval_2: runs when the user clicks on the map. Region 2 is selected.
- Knapp_osynlig: runs when the user presses the 2 button.

Figure 11

7 A guide through the application

This is a small guide through the application. The first thing that appears when the project is opened is the project window. The view icon indicates that the project contains only one view. This cannot be changed: it is not possible to add or remove any views or charts. The restriction protects the application so that the most important (and only) view cannot be deleted. Because no views can be added, users cannot be confused about which view to open.

Opening the view view_granser displays the main view.

The combobox Divide Sweden is different from the other comboboxes because it is directly connected to the map. Changing its selection, for example from All of Sweden to County, takes effect immediately, whereas changes in the other comboboxes take effect only after the database has been queried.

By clicking the down arrow in a combobox the user can choose different settings. In the following example the settings are:

Divide Sweden: County
Cause of death: diabetes
Region 1: Hallands län
Region 2: Norrbottens län
Year: 1990

The result is shown as a bar chart. The county of Halland is shown in red and the county of Norrbotten in green.
The y-axis always begins at zero and ends at three. The selected regions are now yellow on the map. Clicking on the bar chart opens a window titled Identity Results, which shows the name of the selected region(s) and the exact value of the rate.

After Identity Results is closed, a new selection from the database can be made. In the next example the new settings are:

Divide Sweden: Parish
Cause of death: Malignant tumour in respiratory organ
Region 1: Abild
Region 2: Jukkasjärvi
Year: 1987

Until the user presses the button Go!, the result of the previous search can still be seen in the bar chart.

With these settings, the bar chart shows that no one died of malignant tumour in respiratory organ in Abild during 1987. As before, the exact value can be read in Identity Results by clicking on the bar chart. It is not possible to get a zero value for Abild in the Identity Results window.
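The value that the bar chart and Identity Results display is the rate defined in Figure 1. A minimal Python sketch of that calculation is given here for illustration. The function name death_rate and the figures in the example are made up and do not come from TOPSWING or from the Avenue scripts.

```python
def death_rate(deaths, population):
    """Deaths per 1000 inhabitants for one region and one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing so that simple cases come out exact.
    return deaths * 1000 / population


# Made-up example: 3 deaths in a region of 2000 inhabitants.
print(death_rate(3, 2000))
# A region with no deaths from the selected cause has a rate of zero.
print(death_rate(0, 500))
```

A region without any deaths from the selected cause simply has a rate of zero, which corresponds to an empty bar in the chart.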