Have you ever collected data and needed to create one final dataset for your analysis, only to find that the data come from different levels: a mixture of the national, regional, village, and household levels? Your aim is then to create a final dataset with all this information, taking into account the different levels of disaggregation. In this blog, I will provide you with several tips, based on the experience of EDI Global, on merging datasets at different levels using the statistical software Stata.

In order to merge datasets at different levels, you can use two commands: merge or append. Which of the two you use depends on what information is to be added to your master dataset:
– merge if you want to add variables to an existing dataset, and
– append if you want to add observations to a dataset.

In the sections below, I will provide some tips on the different steps to follow to avoid mistakes and keep all the necessary information.

1. Determine an intuitive system of identifiers

For each observation, a unique respondent identifier should allow the user of the dataset to identify the observation. An observation could be an individual, a household, a school, an enterprise, a health facility, a doctor, a student, etc. or, if you are working with macro data, a district, a region, a country, etc.

In order to trace a unique observation, different techniques can be used. We recommend using an intuitive system of identifiers that allows data users to navigate easily through the data and observations. An effective way of achieving this is to create a system that begins at the largest scale and zooms in through each stratum the respondent belongs to, so that each respondent can be identified uniquely. In this way, the ultimate unique respondent identifier is built by concatenating all the higher-level identifiers.

Table 1 presents an example of how this can be done. In this example, the unique respondent identifier is HouseholdID, which is obtained by concatenating higher-level identifiers such as RegionID and DistrictID. The data user will then intuitively understand that the respondent with HouseholdID 121 belongs to the region with RegionID 1 and the district with DistrictID 12. This technique makes it easier to map the different levels and makes merging more efficient and intuitive.

Once you have created unique identifiers in a similar way in all the datasets, you can proceed to the merging process.
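As a rough illustration of the identifier scheme and of the merge and append commands discussed in this post, the do-file sketch below builds two small made-up datasets. The data values and the variable names HouseholdNo and DistrictName are assumptions made for the example, as is the choice to store HouseholdID as a string; only RegionID, DistrictID, and HouseholdID come from the text above.

```stata
* Sketch only: all data values below are invented for illustration.

* Hypothetical district-level dataset (one row per district).
clear
input RegionID DistrictID str10 DistrictName
1 11 "North"
1 12 "South"
2 21 "East"
end
isid DistrictID                  // stops with an error if DistrictID is not unique
tempfile districts
save `districts'

* Hypothetical household-level dataset (one row per household).
clear
input DistrictID HouseholdNo
12 1
12 2
21 1
end

* Build the household identifier by concatenating the higher-level
* identifier with the household number: DistrictID 12, household 1 gives "121".
gen HouseholdID = string(DistrictID) + string(HouseholdNo)
isid HouseholdID

* merge adds variables: many households map to one district.
merge m:1 DistrictID using `districts'
tab _merge                       // inspect matched and unmatched rows
keep if _merge == 3              // keep matched households only
drop _merge

* append adds observations: stack a second batch of households.
tempfile households
save `households'
clear
input DistrictID HouseholdNo
11 1
end
gen HouseholdID = string(DistrictID) + string(HouseholdNo)
append using `households'
isid HouseholdID                 // identifiers should still be unique
```

Because the sketch uses temporary files held in local macros, it should be run as a single do-file rather than line by line. Checking uniqueness with isid before and after each step is one way to catch duplicate identifiers early. Note also that simple concatenation can become ambiguous once a level has ten or more units (district 1 with household 12 and district 11 with household 2 would both end in "112" within the same region), so zero-padding each component to a fixed width may be worth considering.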