Content modified under Creative Commons Attribution license CC-BY 4.0, code under BSD 3-Clause License © 2020 R.C. Cooper
Homework#
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
Problems Part 1#
1. Gordon Moore made an empirical prediction that the number of transistors on a computer chip would double every two years. This prediction became known as Moore’s law. Moore originally expected this empirical relation to hold only from 1965 to 1975 [1,2], but semiconductor manufacturers were able to keep up with Moore’s law until 2015.
In the folder “…/data” is a comma-separated value (CSV) file, “transistor_data.csv”, taken from Wikipedia in 01/2020.
a. Use the !head ../data/transistor_data.csv command to look at the top of the CSV. What are the headings for the columns?
b. Load the CSV into a pandas dataframe. How many missing values (NaN) are in the column with the number of transistors? What fraction are missing?
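One way to count missing values is sketched below. The small dataframe is a made-up stand-in (an assumption for illustration, not the real transistor data), so only the pattern carries over to the CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the transistor data; the values are invented
df = pd.DataFrame({
    "Processor": ["A", "B", "C", "D"],
    "MOS transistor count": [2300.0, np.nan, 4500.0, np.nan],
})

# isna() marks each NaN as True; summing the booleans counts them
n_missing = df["MOS transistor count"].isna().sum()

# fraction of rows in that column that are missing
fraction_missing = n_missing / len(df)

print(n_missing, fraction_missing)
```

The same two lines should work on the dataframe you load with pd.read_csv, provided the column heading matches the one in the file.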
Problems Part 2#
1. Many beers do not report the IBU of the beer because it is very small. By removing rows that do not include the IBU measure, you may be accidentally removing whole categories of beer from the dataset.
a. Use the command
beers_filled = beers.fillna(0)
to clean the beers dataframe.
b. Recreate the “Beer ABV vs. IBU mean values by style” bubble plot with beers_filled. What differences do you notice between the plots?
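The effect of dropping versus filling missing IBU values can be seen on a tiny example. The dataframe below is invented for illustration, and the column names style, abv, and ibu are assumptions about how the beers data is laid out.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature beers table; the values are made up
beers = pd.DataFrame({
    "style": ["Lager", "Lager", "IPA", "IPA", "Cider"],
    "abv": [0.045, 0.050, 0.065, 0.070, 0.055],
    "ibu": [np.nan, 12.0, 60.0, 70.0, np.nan],
})

# Dropping rows with NaN removes every style that never reports IBU
dropped_means = beers.dropna().groupby("style")[["abv", "ibu"]].mean()

# Filling NaN with 0 keeps those styles but pulls the mean IBU toward 0
beers_filled = beers.fillna(0)
filled_means = beers_filled.groupby("style")[["abv", "ibu"]].mean()

print(dropped_means)
print(filled_means)
```

In this sketch the Cider style disappears after dropna but survives fillna(0), while the Lager mean IBU is halved because one of its two entries was missing.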
2. Gordon Moore made an empirical prediction that the number of transistors on a computer chip would double every two years. This prediction became known as Moore’s law. Moore originally expected this empirical relation to hold only from 1965 to 1975 [1,2], but semiconductor manufacturers were able to keep up with Moore’s law until 2015.
In the folder “…/data” is a comma-separated value (CSV) file, “transistor_data.csv”, taken from Wikipedia in 01/2020. Load the CSV into a pandas dataframe. It has the following headings:
Processor
MOS transistor count
Date of Introduction
Designer
MOSprocess
Area
a. In the year 2017, what was the average MOS transistor count? Make a boxplot of the transistor count in 2017 and find the first, second, and third quartiles.
b. Create a semilog y-axis scatter plot (i.e. plt.semilogy) with “Date of Introduction” on the x-axis and “MOS transistor count” on the y-axis. Color the data according to the “Designer”.
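A minimal sketch of both parts follows, using a small invented dataframe with the same column headings (the numbers and designer names are placeholders, not real data). It selects one year, computes quartiles with np.percentile, and draws one plt.semilogy call per designer so that each designer gets its own color.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data; values and designers are made up
data = pd.DataFrame({
    "Date of Introduction": [2016, 2017, 2017, 2017, 2017, 2018],
    "MOS transistor count": [1.0e9, 2.0e9, 4.0e9, 6.0e9, 8.0e9, 1.0e10],
    "Designer": ["X", "X", "Y", "Y", "X", "Y"],
})

# part a: rows from 2017 only, then mean and quartiles
tc_2017 = data.loc[data["Date of Introduction"] == 2017, "MOS transistor count"]
mean_2017 = tc_2017.mean()
q1, q2, q3 = np.percentile(tc_2017, [25, 50, 75])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.boxplot(tc_2017.values)
ax1.set_ylabel("MOS transistor count (2017)")

# part b: one semilogy call per designer gives each its own color
for designer, group in data.groupby("Designer"):
    ax2.semilogy(group["Date of Introduction"],
                 group["MOS transistor count"],
                 "o", label=designer)
ax2.set_xlabel("Date of Introduction")
ax2.set_ylabel("MOS transistor count")
ax2.legend()
```

np.percentile interpolates linearly between data points by default, so its quartiles may differ slightly from other quartile conventions.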
Problems Part 3#
1. There is a CSV file in ‘…/data/primary-energy-consumption-by-region.csv’ that has the energy consumption of different regions of the world from 1965 until 2018 (source: Our World in Data). Compare the energy consumption of the United States to all of Europe. Load the data into a pandas dataframe. Note: you can get certain rows of the dataframe by specifying what you’re looking for, e.g.
EUR = dataframe[dataframe['Entity']=='Europe']
will give you all the rows for Europe’s energy consumption.
a. Plot the total energy consumption of the United States and Europe.
b. Use a linear least-squares regression to find a function for the energy consumption as a function of year
energy consumed = \(f(t) = At+B\)
c. In what year would you split the data and use two lines, as you did for the land temperature anomaly? Split the data and perform two linear fits.
d. What is your prediction for US energy use in 2025? How about European energy use in 2025?
energy = pd.read_csv('../data/primary-energy-consumption-by-region.csv')
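The fitting steps in parts b through d can be sketched with np.polyfit. The arrays below are made-up, noise-free numbers with a deliberate change in slope, and split_year is a hypothetical choice; neither comes from the real energy data.

```python
import numpy as np

# Hypothetical data: steep growth before split_year, slower growth after
years = np.arange(1965, 2019)
split_year = 1990  # assumed for illustration; choose yours from the plot
consumption = np.where(years < split_year,
                       100 + 2.0 * (years - 1965),
                       150 + 0.5 * (years - split_year))

# part b: one line through all the data, f(t) = A*t + B
A_all, B_all = np.polyfit(years, consumption, 1)

# part c: separate lines before and after the split
early = years < split_year
A1, B1 = np.polyfit(years[early], consumption[early], 1)
A2, B2 = np.polyfit(years[~early], consumption[~early], 1)

# part d: extrapolate the most recent trend to 2025
pred_2025 = A2 * 2025 + B2
print(A_all, A1, A2, pred_2025)
```

Extrapolating with the later line rather than the single overall line is a judgment call: it assumes the recent trend is the better guide to 2025.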
2. In 02_Seeing_Stats, you plotted Gordon Moore’s empirical prediction that the number of transistors on a computer chip would double every two years. This prediction became known as Moore’s law. Moore originally expected this empirical relation to hold only from 1965 to 1975 [1,2], but semiconductor manufacturers were able to keep up with Moore’s law until 2015.
Use a linear regression to find your own historical Moore’s Law.
Use code from 02_Seeing_Stats to plot the semilog y-axis scatter plot (i.e. plt.semilogy) for the “Date of Introduction” vs “MOS transistor count”. Color the data according to the “Designer”.
Create a linear regression for the data in the form of
\(\log(\text{transistor count}) = f(\text{date}) = A\cdot \text{date}+B\)
Rearranging gives
\(\text{transistor count} = e^{f(\text{date})} = e^B e^{A\cdot \text{date}}\)
You can perform a least-squares linear regression using the following assignments:
\(x_i=\) dataframe['Date of Introduction'].values
and
\(y_i=\) np.log(dataframe['MOS transistor count'].values)
a. Plot your function on the semilog y-axis scatter plot
b. What are the values of constants \(A\) and \(B\) for your Moore’s law fit? How does this compare to Gordon Moore’s prediction that MOS transistor count doubles every two years?
data = pd.read_csv('../data/transistor_data.csv')
data = data.dropna()
xi=data['Date of Introduction'].values
TC=data['MOS transistor count'].values
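The log-linear fit can be sketched on synthetic data that doubles exactly every two years, which makes the expected answer easy to check. The arrays below are invented for illustration; with the real CSV you would use xi and np.log(TC) in their place.

```python
import numpy as np

# Hypothetical data: a count that doubles exactly every two years
dates = np.arange(1971, 2021, 2).astype(float)
counts = 2300.0 * 2.0 ** ((dates - 1971) / 2)

# fit log(count) = A*date + B
A, B = np.polyfit(dates, np.log(counts), 1)

# doubling: e^(A*(t+T)) = 2*e^(A*t)  =>  T = ln(2)/A
doubling_time = np.log(2) / A

# rearranged fit, count = e^B * e^(A*date), ready for plt.semilogy
fit_counts = np.exp(B) * np.exp(A * dates)

print(A, B, doubling_time)
```

A fitted \(A\) close to \(\ln(2)/2 \approx 0.35\) per year corresponds to doubling every two years, so comparing your \(A\) to that value answers part b directly.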
Problems Part 4#
1. Buffon’s needle problem is another way to estimate the value of \(\pi\) with random numbers. The goal in this Monte Carlo estimate of \(\pi\) is to create a ratio that is close to 3.1415926…, similar to the example with dart points lying inside or outside a unit circle inside a unit square.
In this Monte Carlo estimation, you only need to know two values:
the distance from line 0, \(x \in [0,~1]\)
the orientation of the needle, \(\theta \in [0,~2\pi]\)
The y-location does not affect whether the needle crosses line 0.
a. Generate 100 random x and theta values. Remember \(\theta \in [0,~2\pi]\).
b. Calculate the x locations of the 100 needle ends, e.g. \(x_{end} = x \pm \cos\theta\), since length is unit 1.
c. Use np.logical_and to find the number of needles that have minimum \(x_{end~min}<0\) and maximum \(x_{end~max}>0\). The ratio
\(\frac{\text{number with}~x_{end~min}<0~\text{and}~x_{end~max}>0}{\text{number of needles}} \approx \frac{2}{\pi}\)
for large values of the number of needles.
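Steps a through c can be sketched as a single function. The helper name buffon_ratio and the fixed seed are assumptions made for this illustration; the needle ends follow the \(x \pm \cos\theta\) expression given in step b.

```python
import numpy as np

def buffon_ratio(n_needles, seed=42):
    """Fraction of needles that cross line 0 (hypothetical helper name)."""
    rng = np.random.default_rng(seed)
    x = rng.random(n_needles)                  # distance from line 0, in [0, 1)
    theta = rng.random(n_needles) * 2 * np.pi  # orientation, in [0, 2*pi)

    # the two ends of each needle, as in step b
    x_end_a = x + np.cos(theta)
    x_end_b = x - np.cos(theta)
    x_end_min = np.minimum(x_end_a, x_end_b)
    x_end_max = np.maximum(x_end_a, x_end_b)

    # a needle crosses line 0 when its ends lie on opposite sides of it
    crosses = np.logical_and(x_end_min < 0, x_end_max > 0)
    return crosses.mean()

ratio = buffon_ratio(100)
pi_estimate = 2 / ratio
print(ratio, pi_estimate)
```

With only 100 needles the estimate is noisy. Calling buffon_ratio with a much larger n_needles should bring the ratio closer to \(2/\pi\).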
2. Build a random walk data set with steps \(dx\) and \(dy\) each between \(-1/2\) and \(1/2\) m. If 100 particles take 10 steps, calculate the number of particles that move further than 0.5 m.
Bonus: Can you do the work without any for-loops? Change the size of dx and dy to account for multiple particles.
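A loop-free version can be sketched by giving dx and dy one row per step and one column per particle. Two assumptions are made here: the steps are drawn from a uniform distribution, and "further than 0.5 m" means the straight-line distance from the starting point after the final step.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
n_steps, n_particles = 10, 100

# each entry is one step in [-0.5, 0.5); shape is (steps, particles)
dx = rng.random((n_steps, n_particles)) - 0.5
dy = rng.random((n_steps, n_particles)) - 0.5

# summing over the step axis gives each particle's final position
x_final = dx.sum(axis=0)
y_final = dy.sum(axis=0)

# straight-line distance from the origin for every particle
distance = np.sqrt(x_final**2 + y_final**2)
n_far = np.sum(distance > 0.5)
print(n_far)
```

If the full paths are needed for plotting, np.cumsum(dx, axis=0) and np.cumsum(dy, axis=0) give the position after every step instead of only the last one.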
3. 100 steel rods are going to be used to support a 1000 kg structure. The rods will buckle when the load in any rod exceeds the critical buckling load
\(P_{cr}=\frac{\pi^3 Er^4}{16L^2}\)
where E=200e9 Pa, r=0.01 m +/-0.001 m, and L is the length of the rods supporting the structure. Create a Monte Carlo model montecarlo_buckle that predicts the mean and standard deviation of the buckling load for 100 samples with normally distributed dimensions r and L.
mean_buckle_load,std_buckle_load=\
montecarlo_buckle(E,r_mean,r_std,L,N=100)
a. What is the mean_buckle_load and std_buckle_load for L=5 m?
b. What length, L, should the beams be so that only 2.5% will reach the critical buckling load?
def montecarlo_buckle(E,r_mean,r_std,L,N=100):
    '''Generate N rods of length L with radii of r=r_mean+/-r_std
    then calculate the mean and std of the buckling loads for the
    rod population holding a 1000-kg structure
    Arguments
    ---------
    E: Young's modulus [note: keep units consistent]
    r_mean: mean radius of the N rods holding the structure
    r_std: standard deviation of the N rods holding the structure
    L: length of the rods (or the height of the structure)
    N: number of rods holding the structure, default is N=100 rods
    Returns
    -------
    mean_buckle_load: mean buckling load of N rods under 1000*9.81/N-Newton load
    std_buckle_load: std dev buckling load of N rods under 1000*9.81/N-Newton load
    '''
    return mean_buckle_load, std_buckle_load
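One possible way to fill in the template is sketched below under a different, hypothetical name (montecarlo_buckle_sketch) so it is not mistaken for the intended solution. It assumes only the radius varies, since the signature provides r_std but no spread for L, and it adds an optional seed argument that the template does not have.

```python
import numpy as np

def montecarlo_buckle_sketch(E, r_mean, r_std, L, N=100, seed=None):
    """Sketch: mean and std of critical buckling load for N rods.

    Assumes normally distributed radii and a fixed length L.
    """
    rng = np.random.default_rng(seed)
    r = rng.normal(r_mean, r_std, size=N)      # sampled rod radii [m]
    P_cr = np.pi**3 * E * r**4 / (16 * L**2)   # critical load per rod [N]
    return P_cr.mean(), P_cr.std()

mean_buckle_load, std_buckle_load = montecarlo_buckle_sketch(
    200e9, 0.01, 0.001, L=5, N=100, seed=0)

# load carried by each rod if the 1000-kg weight is shared equally
load_per_rod = 1000 * 9.81 / 100
print(mean_buckle_load, std_buckle_load, load_per_rod)
```

For part b, one approach is to compare load_per_rod with the 2.5th percentile of the sampled critical loads (np.percentile) and adjust L until the two match.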