Appendix 5: Emsi MR-SAM
Emsi’s MR-SAM represents the flow of all economic transactions in a given region. It replaces Emsi’s previous input-output (IO) model, which operated with some 1,000 industries, four layers of government, a single household consumption sector, and an investment sector. The old IO model was used to simulate the ripple effects (i.e., multipliers) in the regional economy as a result of industries entering or exiting the region. The MR-SAM model performs the same tasks as the old IO model, but it also does much more. Along with the same 1,000 industries, government, household and investment sectors embedded in the old IO tool, the MR-SAM exhibits much more functionality, a greater amount of data, and a higher level of detail on the demographic and occupational components of jobs (16 demographic cohorts and about 750 occupations are characterized).
This appendix presents a high-level overview of the MR-SAM. Additional documentation on the technical aspects of the model is available upon request.
Data sources for the model
The Emsi MR-SAM model relies on a number of internal and external data sources, mostly compiled by the federal government. What follows is a listing and short explanation of our sources. The use of these data will be covered in more detail later in this appendix.
Emsi Data are compiled from many sources to produce detailed industry, occupation, and demographic jobs and earnings data at the local level. This information (especially sales-to-jobs ratios derived from jobs and earnings-to-sales ratios) is used to help regionalize the national matrices as well as to disaggregate them into more detailed industries than are normally available.
BEA Make and Use Tables (MUT) are the basis for input-output models in the U.S. The make table is a matrix that describes the amount of each commodity made by each industry in a given year. Industries are placed in the rows and commodities in the columns. The use table is a matrix that describes the amount of each commodity used by each industry in a given year. In the use table, commodities are placed in the rows and industries in the columns. The BEA produces two different sets of MUTs, the benchmark and the summary. The benchmark set contains about 500 sectors and is released every five years, with a five-year lag time (e.g., 2002 benchmark MUTs were released in 2007). The summary set contains about 80 sectors and is released every year, with a two-year lag (e.g., 2010 summary MUTs were released in late 2011/early 2012). The MUTs are used in the Emsi MR-SAM model to produce an industry-by-industry matrix describing all industry purchases from all industries.
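As an illustration of how make and use tables can be combined into an industry-by-industry matrix, the sketch below applies the fixed market share (industry technology) approach found in input-output textbooks. This is an assumption made for illustration; the source does not say which transformation Emsi applies. The 2x2 tables and all variable names are invented.

```python
import numpy as np

# Toy data, invented for illustration: 2 industries and 2 commodities.
# Make table V: industries in rows, commodities in columns.
V = np.array([[90.0, 10.0],
              [0.0, 100.0]])
# Intermediate part of the use table U: commodities in rows, industries in columns.
U = np.array([[20.0, 30.0],
              [10.0, 25.0]])

q = V.sum(axis=0)  # total output of each commodity
x = V.sum(axis=1)  # total output of each industry

# Market shares: D[i, j] is the share of commodity j made by industry i.
D = V / q
# Industry-by-industry transactions: Z[i, k] is what industry k buys from industry i.
Z = D @ U
# Technical coefficients: purchases per dollar of the buying industry's output.
A = Z / x
```

Because each column of D sums to one, every industry's total intermediate purchases are the same in Z as in the use table.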
BEA Gross Domestic Product by State (GSP) describes gross domestic product from the value added (also known as added income) perspective. Value added is equal to employee compensation, gross operating surplus, and taxes on production and imports, less subsidies. Each of these components is reported for each state and an aggregate group of industries. This dataset is updated once per year, with a one-year lag. The Emsi MR-SAM model makes use of this data as a control and pegs certain pieces of the model to values from this dataset.
BEA National Income and Product Accounts (NIPA) cover a wide variety of economic measures for the nation, including gross domestic product (GDP), sources of output, and distribution of income. This dataset is updated periodically throughout the year and can be between a month and several years old depending on the specific account. NIPA data are used in many of the Emsi MR-SAM processes as both controls and seeds.
BEA Local Area Personal Income (LPI) encapsulates multiple tables with geographies down to the county level. The following two tables are specifically used: CA05 (Personal income and earnings by industry) and CA91 (Gross flow of earnings). CA91 is used when creating the commuting submodel and CA05 is used in several processes to help with place-of-work and place-of-residence differences, as well as to calculate personal income, transfers, dividends, interest, and rent.
Bureau of Labor Statistics Consumer Expenditure Survey (CEX) reports on the buying habits of consumers along with some information as to their income, consumer unit, and demographics. Emsi relies heavily on these data to build the national estimates of consumption on industries by demographic cohort and income type.
Census of Governments’ (CoG) state and local government finance dataset is used specifically to help break out the state and local data reported in the MUTs. This allows Emsi to have unique production functions for each of its state and local government sectors.
Census’ OnTheMap (OTM) is a collection of three datasets for the census block level for multiple years. Origin-Destination (OD) offers job totals associated with both home census blocks and a work census block. Residence Area Characteristics (RAC) offers jobs totaled by home census block. Workplace Area Characteristics (WAC) offers jobs totaled by work census block. All three of these are used in the commuting submodel to gain better estimates of earnings by industry that may be counted as commuting. This dataset has holes for specific years and regions. These holes are filled with Census’ Journey-to-Work, described later.
Census’ Current Population Survey (CPS) is used as the basis for the demographic breakout data of the MR-SAM model. This set is used to estimate the ratios of demographic cohorts and their income for the three different income categories (i.e., wages, property income, and transfers).
Census’ Journey-to-Work (JtW) is part of the 2000 Census and describes the number of commuting jobs between counties. This set is used to fill in the areas where OTM does not have data.
Census’ American Community Survey (ACS) Public Use Microdata Sample (PUMS) is the replacement for Census’ long form and is used by Emsi to fill the holes in the CPS data.
Oak Ridge National Lab (ORNL) County-to-County Distance Matrix (Skim Tree) contains a matrix of distances and network impedances between each county via various modes of transportation such as highway, railroad, water, and combined highway-rail. Also included in this set are minimum impedances utilizing the best combination of paths. The ORNL distance matrix is used in Emsi’s gravitational flows model that estimates the amount of trade between counties in the country.
Overview of the MR-SAM model
Emsi’s MR-SAM modeling system is a comparative static model in the same general class as RIMS II (Bureau of Economic Analysis) and IMPLAN (Minnesota Implan Group). The MR-SAM model is thus not an econometric model, the primary example of which is PolicyInsight by REMI. It relies on a matrix representation of industry-to-industry purchasing patterns originally based on national data which are regionalized with the use of local data and mathematical manipulation (i.e., non-survey methods). Models of this type estimate the ripple effects of changes in jobs, earnings, or sales in one or more industries upon other industries in a region.
The Emsi MR-SAM model shows final equilibrium impacts—that is, the user enters a change that perturbs the economy and the model shows the changes required to establish a new equilibrium. As such, it is not a dynamic model that shows year-by-year changes over time (as REMI’s does).
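The comparative static logic can be sketched with a generic Leontief calculation: a change in final demand is entered, and the model returns the output changes needed to reach the new equilibrium. The coefficient matrix and the size of the shock below are hypothetical, and the code shows the general technique for this class of models, not Emsi's implementation.

```python
import numpy as np

# Hypothetical regional coefficients: A[i, j] is the input bought from
# industry i per dollar of industry j's output.
A = np.array([[0.10, 0.20],
              [0.30, 0.05]])

# Leontief inverse: total (direct plus indirect) output required per dollar
# of final demand.
L = np.linalg.inv(np.eye(2) - A)

# Perturb the economy: $1 million of new final demand for industry 0.
delta_f = np.array([1_000_000.0, 0.0])

# Output changes in each industry at the new equilibrium.
delta_x = L @ delta_f

# Output multiplier of each industry: column sums of the Leontief inverse.
output_multipliers = L.sum(axis=0)
```

The result is a single before-and-after comparison; nothing in the calculation says how long the adjustment takes, which is what separates this class of model from a dynamic one.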
Following standard practice, the SAM model appears as a square matrix, with each row sum exactly equaling the corresponding column sum. Reflecting its kinship with the standard Leontief input-output framework, individual SAM elements show accounting flows between row and column sectors during a chosen base year. Read across a row, SAM entries show the flow of funds into that row account from each column account (also known as receipts or the appropriation of funds by the row account). Read down a column, SAM entries show the flow of funds out of that column account into each row account (also known as expenditures or the dispersal of funds to those row accounts).
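A toy three-account SAM makes the accounting identity concrete. The accounts and values are invented, and the code assumes the usual convention that entry (i, j) is a payment from column account j to row account i.

```python
import numpy as np

accounts = ["industry", "households", "government"]

# sam[i, j] = payment from column account j to row account i (assumed convention).
sam = np.array([[0.0, 70.0, 30.0],
                [80.0, 0.0, 10.0],
                [20.0, 20.0, 0.0]])

receipts = sam.sum(axis=1)      # row sums: funds received by each account
expenditures = sam.sum(axis=0)  # column sums: funds paid out by each account

for name, received, spent in zip(accounts, receipts, expenditures):
    print(f"{name}: receipts={received:.0f}, expenditures={spent:.0f}")
```

Each account's receipts equal its expenditures, which is the balance condition a SAM must satisfy.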
MULTI-REGIONAL ASPECT OF THE MR-SAM
Multi-regional (MR) describes a non-survey model that has the ability to analyze the transactions and ripple effects (i.e., multipliers) of not just a single region, but multiple regions interacting with each other. Regions in this case are made up of a collection of counties.
Emsi’s multi-regional model is built on gravitational flows, assuming that the larger a county’s economy, the more influence it will have on the surrounding counties’ purchases and sales. The equation behind this model is essentially the same one that Isaac Newton used to calculate the gravitational pull between planets and stars. In Newton’s equation, the masses of both objects are multiplied, then divided by the square of the distance separating them and multiplied by a constant. In Emsi’s model, the masses are replaced with the supply of a sector for one county and the demand for that same sector from another county. The distance is replaced with an impedance value that takes into account the distance, type of roads, rail lines, and other modes of transportation. Once this is calculated for every county-to-county pair, a set of mathematical operations is performed to make sure all counties absorb the correct amount of supply from every county and the correct amount of demand from every county. These operations produce more than 200 million data points.
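A minimal sketch of a gravity calculation for one sector follows. It seeds each county pair with supply times demand divided by a power of the impedance, then rescales rows and columns in turn until every county's supply and demand are fully allocated. The source does not name the balancing operations; iterative proportional fitting is used here as an assumption, and the function name, the exponent beta, and all numbers are hypothetical.

```python
import numpy as np

def gravity_flows(supply, demand, impedance, beta=2.0, iters=200, tol=1e-10):
    """Doubly constrained gravity flows via iterative proportional fitting."""
    # Unbalanced seed: supply_i * demand_j / impedance_ij ** beta.
    flows = np.outer(supply, demand) / impedance ** beta
    for _ in range(iters):
        # Rescale rows so each county ships exactly its supply.
        flows *= (supply / flows.sum(axis=1))[:, None]
        # Rescale columns so each county receives exactly its demand.
        flows *= (demand / flows.sum(axis=0))[None, :]
        if np.allclose(flows.sum(axis=1), supply, rtol=tol, atol=0):
            break
    return flows

# Three hypothetical counties; total supply equals total demand.
supply = np.array([120.0, 60.0, 20.0])
demand = np.array([100.0, 70.0, 30.0])
impedance = np.array([[1.0, 4.0, 9.0],
                      [4.0, 1.0, 5.0],
                      [9.0, 5.0, 1.0]])

flows = gravity_flows(supply, demand, impedance)
```

Here flows[i, j] is the estimated amount of the sector's output shipped from county i to county j. Repeating the calculation for every sector over every pair of U.S. counties is what produces a dataset on the scale the text describes.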
Components of the Emsi MR-SAM model
The Emsi MR-SAM is built from a number of different components that are gathered together to display information whenever a user selects a region. What follows is a description of each of these components and how each is created. Emsi’s internally created data are used to a great extent throughout the processes described below, but their creation is not described in this appendix.
COUNTY EARNINGS DISTRIBUTION MATRIX
The county earnings distribution matrices describe the earnings spent by every industry on every occupation for a year—i.e., earnings by occupation. The matrices are built utilizing Emsi’s industry earnings, occupational average earnings, and staffing patterns.
Each matrix starts with a region’s staffing pattern matrix which is multiplied by the industry jobs vector. This produces the number of occupational jobs in each industry for the region. Next, the occupational average hourly earnings per job are multiplied by 2,080 hours, which converts the average hourly earnings into a yearly estimate. Then the matrix of occupational jobs is multiplied by the occupational annual earnings per job, converting it into earnings values. Last, all earnings are adjusted to match the known industry totals. This is a fairly simple process, but one that is very important. These matrices describe the place-of-work earnings used by the MR-SAM.
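The steps above can be sketched for a toy region with two industries and two occupations. All numbers are invented, and the orientation of the staffing pattern (occupations in rows, industries in columns, each column summing to one) is an assumption.

```python
import numpy as np

HOURS_PER_YEAR = 2080

# staffing[o, i] = share of industry i's jobs held by occupation o.
staffing = np.array([[0.6, 0.2],
                     [0.4, 0.8]])
industry_jobs = np.array([100.0, 50.0])
hourly_earnings = np.array([20.0, 35.0])                   # by occupation
industry_earnings = np.array([5_500_000.0, 3_400_000.0])   # known industry totals

# Step 1: occupational jobs in each industry.
occ_jobs = staffing * industry_jobs

# Step 2: convert hourly earnings to annual earnings per job.
annual_earnings = hourly_earnings * HOURS_PER_YEAR

# Step 3: convert jobs to earnings values.
raw_earnings = occ_jobs * annual_earnings[:, None]

# Step 4: scale each industry column to match its known earnings total.
scale = industry_earnings / raw_earnings.sum(axis=0)
earnings = raw_earnings * scale
```

After the final step, each column of earnings sums to the known industry total while keeping the occupational mix implied by the staffing pattern.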
COMMUTING MODEL
The commuting sub-model is an integral part of Emsi’s MR-SAM model. It allows the regional and multi-regional models to know what amount of the earnings can be attributed to place-of-residence vs. place-of-work. The commuting data describe the flow of earnings from any county to any other county (including within the counties themselves). For this situation, the commuted earnings are not just a single value describing total earnings flows over a complete year, but are broken out by occupation and demographic. Breaking out the earnings allows for analysis of place-of-residence and place-of-work earnings. These data are created using Census’ OnTheMap dataset, Census’ Journey-to-Work, BEA’s LPI CA91 and CA05 tables, and some of Emsi’s data. The process incorporates the cleanup and disaggregation of the OnTheMap data, the estimation of a closed system of county inflows and outflows of earnings, and the creation of finalized commuting data.
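The relationship between place-of-work and place-of-residence earnings can be illustrated with a small county-to-county earnings flow matrix. The three counties, the values, and the matrix orientation are hypothetical, and the sketch omits the occupation and demographic breakouts.

```python
import numpy as np

counties = ["A", "B", "C"]

# flows[w, r] = earnings from jobs located in county w that are paid to
# residents of county r. The diagonal holds people who live and work in
# the same county.
flows = np.array([[500.0, 40.0, 10.0],
                  [60.0, 300.0, 20.0],
                  [5.0, 15.0, 100.0]])

place_of_work = flows.sum(axis=1)       # earnings generated in each county
place_of_residence = flows.sum(axis=0)  # earnings taken home to each county

# Net commuting adjustment: positive when residents bring in more earnings
# than in-commuters take out.
net_commuting = place_of_residence - place_of_work
```

In a closed system, every dollar that leaves one county arrives in another, so the net adjustments sum to zero across all counties.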
NATIONAL SAM
The national SAM as described above is made up of several different components. Many of the elements discussed are filled in with values from the national Z matrix (the industry-to-industry transaction matrix). This matrix is built from BEA data that describe which industries make and use what commodities at the national level. These data are manipulated with some industry-standard equations to produce the national Z matrix. The data in the Z matrix act as the basis for the majority of the data in the national SAM. The rest of the values are filled in with data from the county earnings distribution matrices, the commuting data, and the BEA’s National Income and Product Accounts. One of the major issues that affect any SAM project is the combination of data from multiple sources that may not be consistent with one another. Matrix balancing is the broad name for the techniques used to correct this problem. Emsi uses a modification of the “diagonal similarity scaling” algorithm to balance the national SAM.
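To show what matrix balancing does, here is a sketch of a basic cyclic form of diagonal similarity scaling. It looks for a positive diagonal matrix X such that X A X^-1 has equal row and column sums, adjusting one account at a time. Emsi's modification is not described in the source, so this is only the unmodified textbook idea; the function name and the test matrix are hypothetical.

```python
import numpy as np

def dss_balance(matrix, sweeps=500, tol=1e-9):
    """Balance a square matrix so each row sum equals its column sum."""
    a = np.array(matrix, dtype=float)
    n = a.shape[0]
    for _ in range(sweeps):
        for i in range(n):
            row = a[i, :].sum() - a[i, i]  # off-diagonal receipts of account i
            col = a[:, i].sum() - a[i, i]  # off-diagonal expenditures of account i
            if row > 0 and col > 0:
                d = np.sqrt(col / row)
                a[i, :] *= d  # scale receipts up (or down)
                a[:, i] /= d  # scale expenditures the opposite way
        if np.allclose(a.sum(axis=1), a.sum(axis=0), rtol=tol, atol=0):
            break
    return a

# An unbalanced toy SAM: row sums are (105, 90, 45), column sums (100, 95, 45).
unbalanced = np.array([[0.0, 70.0, 35.0],
                       [80.0, 0.0, 10.0],
                       [20.0, 25.0, 0.0]])
balanced = dss_balance(unbalanced)
```

Each step multiplies one account's row by d and divides its column by d, which leaves the diagonal entry unchanged and makes that account's receipts and expenditures equal. Zero cells stay zero, so the structure of the original data is preserved.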
GRAVITATIONAL FLOWS MODEL
The most important piece of the Emsi MR-SAM model is the gravitational flows model that produces county-by-county regional purchasing coefficients (RPCs). RPCs estimate how much an industry purchases from other industries inside and outside of the defined region. This information is critical for calculating all IO models.
Gravity modeling starts with the creation of an impedance matrix that values the difficulty of moving a product from county to county. For each sector, an impedance matrix is created based on a set of distance impedance methods for that sector. A distance impedance method is one of the measurements reported in the Oak Ridge National Laboratory’s County-to-County Distance Matrix. In this matrix, every county-to-county relationship is accounted for in six measures: great-circle distance, highway impedance, rail miles, rail impedance, water impedance, and highway-rail-highway impedance. Next, using the impedance information, the trade flows for each industry in every county are solved for. The result is an estimate of multi-regional flows from every county to every county. These flows are divided by each respective county’s demand to produce multi-regional RPCs.
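The final division can be sketched as follows for a single sector. The flow matrix stands in for the output of a balanced gravity calculation, and its values are invented. The helper that aggregates counties into a region is a hypothetical illustration of how county-level results could be combined, not a documented Emsi routine.

```python
import numpy as np

# flows[i, j] = one sector's output shipped from county i to county j.
flows = np.array([[60.0, 30.0, 10.0],
                  [25.0, 40.0, 5.0],
                  [15.0, 10.0, 15.0]])

# Each county's total demand for the sector is its column sum.
demand = flows.sum(axis=0)

# rpc[i, j] = share of county j's demand that is supplied by county i.
rpc = flows / demand

def regional_rpc(flows, region):
    """Share of a multi-county region's demand met from inside the region."""
    region = list(region)
    inside = flows[np.ix_(region, region)].sum()
    return inside / flows[:, region].sum()

# Counties 0 and 1 treated as one region.
two_county_rpc = regional_rpc(flows, [0, 1])
```

Each column of rpc sums to one because every unit of a county's demand is supplied by some county. As more counties are added to a region, a larger share of its demand is met internally.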