Saturday, 18 January 2020

Employment and Population Statistics in Singapore: Practising existing skills on an interesting topic.

This will be a long post, to ensure that the results of this analysis are reproducible.

It has been a while since my last blog post. I wanted to write about a happy trip to Taiwan but have not been in the mood to do so. People who know me well know that I am an advocate for real value-add. People who don't know me judge based on malicious rumours that I am well aware of. This article is prompted by the recent public interest in employment and population statistics in Singapore (read about the first instance here). Please note that the data presented is what I could dig up from SingStat in a reasonable time; some results were derived from estimates and may not be as precise as I would like them to be. I took this as a practice session to revive skills previously developed from looking at scientific data in many dimensions (thanks to my training in combustion simulations as early as 2004). I would have liked to work with raw data as much as possible, as that is my usual way of looking at data, but it is unavailable. Well, we work with what we have. I am not going to take sides in this post, so please don't go around spreading rumours like some people did. Hope I don't get POFMAed.

A scientist (me) asks questions, forms hypotheses and does experiments to answer the questions. For this post, I shall ask some questions and present how they can be answered. There is no right or wrong way to do this, as the data available is not broken down into fine details. This is like a literature review with some analyses based on available data. The ideal scenario is to have a few million data points, with relevant dimensions such as nationality, occupation and employment status, provided over time. It is kind of like combustion simulations (you can read about them in a well-known paper here), where data is in space, time and many other dimensions. Why do I do this? Because I don't trust everything I read. Also, I hope to produce some useful data so that others can discuss the topic in a more constructive manner.

Data will be presented from the year 2000 onwards, due to a lack of earlier historical data. This is to show general trends over a period of time that is as long as possible. The recent efforts by our Government to enable data access are great, as data retrieval can be automated.

First question: How do we separate population numbers for Singaporeans and PRs, and derive the number of new citizens every year? Go to the SingStat site and look for the data. To count the number of residents for every year, search for 'Residents By Age Group & Type Of Dwelling, Annual', then collect the data for each age group and add them up. For the number of Singaporeans every year, look for 'Singapore Citizens By Age Group, Ethnic Group And Sex, End June, Annual', then find 'Total Citizen'. For births, look for 'Births And Fertility Rates, Annual', then 'Resident Live Births'. For deaths, look for 'Death And Death Rates, Annual', then 'Resident Deaths'. Take the resident/citizen numbers from the previous year, add the births and deduct the deaths for that year, then update the trend for the current year. That forms the expected population if there were no immigration/emigration. As birth and death data were available only for residents, the end result can only be an estimate. Here is the result.

Resident and Citizen population. Data for trends without immigration derived using resident birth and death data.
Estimated increase in resident and citizen population since 2000.
The data shows a slowdown in the increase of residents since 2010 (the line is almost parallel to that of the citizen data). The low growth in PRs is possibly due to the elections in 2011. The estimated data for expected population growth for residents and citizens look similar because the segregated birth and death data was unavailable. These estimates show an increase of approximately 10,000 in the citizen population per year from 2010 to 2018 due to net immigration (immigration minus emigration).
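The no-migration trend described above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the function names are hypothetical, the trend is rolled forward from the first year's figure, and the numbers in the example are made up rather than taken from SingStat.

```python
def project_without_migration(start_population, births, deaths):
    """Roll a population forward using only births and deaths.

    births[i] and deaths[i] are the counts for the i-th year after the
    start year. Returns the expected population for every year,
    including the start year.
    """
    projected = [start_population]
    for b, d in zip(births, deaths):
        projected.append(projected[-1] + b - d)
    return projected


def annual_net_migration(actual, births, deaths):
    """Estimate net migration per year as the part of the actual
    population change that births minus deaths does not explain."""
    return [
        (actual[i + 1] - actual[i]) - (births[i] - deaths[i])
        for i in range(len(births))
    ]


# Made-up illustrative numbers (NOT SingStat data).
actual = [1000, 1035, 1060]
births = [30, 28]
deaths = [10, 12]

print(project_without_migration(actual[0], births, deaths))  # [1000, 1020, 1036]
print(annual_net_migration(actual, births, deaths))          # [15, 9]
```

Because only resident births and deaths are available, applying the same counts to the citizen series gives an estimate, not an exact figure.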

Next question: How many are unemployed in Singapore, and out of those who are unemployed, how many are residents? Search for 'Labour Force, Aged 15 Years And Over, (June), Annual', followed by 'total labour', to get overall statistics on unemployment. For resident data, look for 'Employed Residents Aged 15 Years And Over By Occupation And Age Group, (June), Annual', followed by 'Total Employed Residents'. For PMET data, three groups were included (among them 'Professionals' and 'Legislators, Senior Officials & Managers'). There are ways to tally the data, to ensure that the numbers match. Some roundoff errors arise because the unemployment rate is presented to only 2 s.f., so I had a look at them. If you take the total unemployment number and compare it with the total labour force multiplied by the unemployment rate, you can see some discrepancies, as shown below.

Comparing different methods to calculate unemployment numbers, to assess roundoff errors
To check if reliable data could be produced using a low-resolution unemployment rate, points above (overestimate) and below (underestimate) the ideal line were counted for the period 2000-2018. As long as these two counts are almost equal, it is fine, as the data set is too small to draw any meaningful conclusion about the roundoff errors. These data points are compared against resident unemployment, as shown below.

Total and resident unemployment numbers
Total unemployed resident numbers were calculated using total resident workforce multiplied by resident unemployment rate. You should expect the total unemployed residents to be always less than or equal to total unemployed. It can be observed that the two methods used to calculate total unemployment produced results that were close enough. Interesting to note that the trends for total and resident unemployment were almost identical.
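The roundoff check above can be sketched as follows. This is a rough illustration, not the exact spreadsheet I used: the function name is hypothetical, I assume "overestimate" means the rate-based number is higher than the reported number, and the figures are made up (in thousands).

```python
def compare_unemployment_estimates(labour_force, rate_percent, reported):
    """Compare reported unemployment with labour force x unemployment rate.

    Returns the rate-based estimates plus the number of years in which
    the estimate lies above (over) or below (under) the reported figure.
    """
    estimates = []
    over = under = 0
    for lf, rate, rep in zip(labour_force, rate_percent, reported):
        est = lf * rate / 100.0
        estimates.append(est)
        if est > rep:
            over += 1
        elif est < rep:
            under += 1
    return estimates, over, under


# Made-up illustrative numbers in thousands (NOT SingStat data).
labour_force = [2000.0, 2100.0, 2200.0]
rate_percent = [3.1, 2.9, 3.0]      # rate as published, 2 s.f.
reported = [61.0, 62.0, 66.0]

estimates, over, under = compare_unemployment_estimates(
    labour_force, rate_percent, reported
)
print(over, under)
```

If the over and under counts are roughly balanced, the rounding of the rate is not pushing the estimate consistently in one direction.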

Next question: What are the trends for each segment of the workforce? Using the employment data derived earlier, it was possible to answer this question. A way to verify the data is to use CPF data. Search for 'Active Central Provident Fund Members By Age Group (End Of Period), Annual', followed by 'Total Active CPF'. This identifies active CPF members, that is, those with positive contributions to their CPF accounts. Results are shown as follows:

Trends in workforce
The total labour force is much higher than the resident workforce. This is probably because of the construction, manufacturing and service-related sectors, which are labour intensive. As citizens and PRs have to make CPF contributions, the number of active CPF members serves as a form of check for data quality. By deducting the number of taxi drivers, Grab drivers and Grab food delivery staff (data from 2018-2019), the resultant data was very close to that of active CPF members. This is possibly because stint-based workers do not have to contribute to CPF (otherwise they may starve). A way to assess job creation is to look for faster growth in resident PMETs compared to other sectors within the resident workforce. It would be good if the data could be segregated further to compare Singaporeans with PRs.
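The CPF cross-check above is simple arithmetic, sketched here under assumptions: I take the starting point to be the number of employed residents, the function name and category keys are hypothetical, and the numbers are made up (in thousands).

```python
def cpf_cross_check(employed_residents, stint_workers, active_cpf_members):
    """Deduct stint-based workers from employed residents and compare
    the result with the number of active CPF members.

    Returns the expected number of contributors and the relative gap
    to the active CPF member count.
    """
    expected = employed_residents - sum(stint_workers.values())
    relative_gap = (expected - active_cpf_members) / active_cpf_members
    return expected, relative_gap


# Made-up illustrative numbers in thousands (NOT actual data).
stint_workers = {"taxi": 80, "private_hire": 40, "food_delivery": 10}
expected, gap = cpf_cross_check(2000, stint_workers, 1850)
print(expected)        # 1870
print(round(gap, 4))   # 0.0108
```

A small relative gap suggests the two data sources are consistent with each other.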

Next: Convert some of the data to see the changes between consecutive years. This is shown below:

Annual changes in population and employment data, benchmarked with unemployment
There is a gradual decrease in the resident workforce that did not make positive contributions to their CPF, possibly due to compulsory contributions to Medisave in some professions. I remember having to do so while I was a student. The number of unemployed residents has remained stagnant over the years. It would be good to reduce this number as much as possible. The number of PRs has remained constant over the past decade, with a net growth of around 10,000 citizens per year via migration. The change in employed residents should always be more than the net influx. A better scenario would be to have a higher change in resident PMETs, which has been positive so far. This is quite a good scenario for residents. However, it would be better to know the results for Singaporeans. For the resident workforce who are not active in their CPF accounts, it would be good to create opportunities for them to switch to stable jobs where employers are required to contribute to their CPF accounts. This will prevent social problems down the road. We should give opportunities to Singaporeans to ensure that they can support themselves in old age. Comparing the change in employed residents with the 60,000 new jobs created as described in this article, it can be seen that the numbers here are pretty close. It is interesting to note that the increase in resident PMETs from 2015-2018 is around 87,000, more than 60,000, indicating a net shift towards PMET jobs.
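The conversion to annual changes is just a difference between consecutive years. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def annual_changes(series):
    """Return the change from the previous year for each year.

    `series` maps year -> value. The first year has no previous year,
    so it does not appear in the result.
    """
    years = sorted(series)
    return {
        year: series[year] - series[prev]
        for prev, year in zip(years, years[1:])
    }


# Made-up illustrative numbers (NOT SingStat data).
employed_residents = {2000: 100, 2001: 110, 2002: 105}
print(annual_changes(employed_residents))  # {2001: 10, 2002: -5}
```

The same function can be applied to each series (residents, citizens, employed residents, resident PMETs) before plotting them together.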

Next: to show the cumulative effects on employment and population since 2000. To get trends using a later benchmark year, just set the point in that particular year to zero. Here are the results:

Overall trends in employment and population since 2000
Since 2000, the increase in the number of jobs for residents is more than 700,000. PMET jobs increased by around 600,000 for residents. One way to gauge the quality of the jobs created is to see whether the line for PMET jobs converges with that for all jobs. In the past decade, this gap has widened, so perhaps more PMET jobs should be given to residents. The change in PRs and the change due to net immigration are lower than the change in employment in general, so it is not a bad thing after all.
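Setting the benchmark year to zero, as described above, can be sketched like this. The function name is hypothetical and the numbers are made up for illustration.

```python
def change_since(series, benchmark_year):
    """Express a year -> value series as the cumulative change relative
    to the benchmark year, so that the benchmark year itself is zero.
    Years before the benchmark are dropped."""
    base = series[benchmark_year]
    return {
        year: value - base
        for year, value in sorted(series.items())
        if year >= benchmark_year
    }


# Made-up illustrative numbers (NOT SingStat data).
resident_jobs = {2000: 100, 2001: 110, 2002: 105, 2003: 130}
print(change_since(resident_jobs, 2000))  # {2000: 0, 2001: 10, 2002: 5, 2003: 30}
print(change_since(resident_jobs, 2002))  # {2002: 0, 2003: 25}
```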

Next question: are new citizens and PRs stealing the lunch? Hope I don't get into trouble for this. The change in number of jobs for original Singaporeans was derived by deducting the change in PRs and citizens due to net immigration from the change in number of jobs. Here it is (estimated):

Annual change in resident PMETs and estimated number of jobs for "original" Singaporeans
The trends for the change in resident PMETs and the change in the number of jobs for original Singaporeans are similar. The estimated number of jobs that go to original Singaporeans fluctuates around zero, which may be alright, as we have an ageing population. It would be great if I could access data that is more detailed, in order to form better conclusions about the big picture. Personally, I would like to have a good PMET job as well, although that may be impossible in Singapore.
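The estimate for "original" Singaporeans described above is a per-year subtraction. A rough sketch, with a hypothetical function name and made-up numbers, and with the simplifying assumption that every new PR or new citizen takes up one job:

```python
def jobs_for_original_citizens(change_in_jobs, change_in_prs, net_new_citizens):
    """Estimate the annual change in jobs held by 'original' Singaporeans:
    change in resident jobs minus change in PRs minus the change in
    citizens due to net immigration, for each year."""
    return [
        jobs - prs - citizens
        for jobs, prs, citizens in zip(change_in_jobs, change_in_prs, net_new_citizens)
    ]


# Made-up illustrative numbers in thousands (NOT SingStat data).
print(jobs_for_original_citizens([30, 12], [5, 2], [10, 10]))  # [15, 0]
```

Since not every new PR or citizen is in the workforce, this is a conservative estimate and should be read as such.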

Please feel free to comment, contribute, debate, criticise etc. so that I can learn from this. However, do not attempt to misrepresent the data available here. Some results are estimates made to fill in for missing data, so they may not be precise enough. If the graphs are reused, they have to be presented with the relevant formulas and data sources.