diff --git a/index.html b/index.html
index 17d3afcc1..77d8a620a 100644
--- a/index.html
+++ b/index.html
@@ -1,3558 +1,3558 @@
-Report Sample
- -
- -
-
-

Report Sample

-
- - - -
- -
-
Author
-
-

Student name

-
-
- -
-
Published
-
-

August 22, 2023

-
-
- - -
- - -
- -
-

Introduction

-

This is an introduction to kernel regression, a non-parametric technique that estimates the conditional expectation of one random variable given another. The goal of kernel regression is to discover a possibly non-linear relationship between two random variables. Kernel estimation, also called kernel smoothing, is the main method for estimating such a curve in non-parametric statistics. In a kernel estimator, the weight function is known as the kernel function (Efromovich 2008). Cite this paper (Bro and Smilde 2014). The GEE (Wang 2014).

-
-
-

Methods

-

The common non-parametric regression model is \(Y_i = m(X_i) + \varepsilon_i\), where \(Y_i\) is the sum of the regression function value \(m(X_i)\) at \(X_i\) and an error term \(\varepsilon_i\). Here \(m(x)\) is unknown and the \(\varepsilon_i\) are random errors. With this definition, we can build an estimate by local averaging, i.e. \(m(x)\) can be estimated by averaging the \(Y_i\) whose \(X_i\) is near \(x\). In other words, we fit a curve through the data points using the surrounding data points. The estimation formula is given below (R Core Team 2019):

-

\[
M_n(x) = \sum_{i=1}^{n} W_n (X_i) Y_i \tag{1}
\]
Here \(W_n(X_i)\) is the real-valued weight assigned to observation \(i\). Weights are positive and small if \(X_i\) is far from \(x\).

-
-
-

Analysis and Results

-
-

Data and Visualization

-

A study was conducted to determine how…

-
-
-Code -
# loading packages 
-library(tidyverse)
-library(knitr)
-library(ggthemes)
-library(ggrepel)
-library(dslabs)
-
-
-
-
-Code -
# Load Data
-kable(head(murders))
-
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
state       abb  region  population  total
Alabama     AL   South      4779736    135
Alaska      AK   West        710231     19
Arizona     AZ   West       6392017    232
Arkansas    AR   South      2915918     93
California  CA   West      37253956   1257
Colorado    CO   West       5029196     65
-
-
-Code -
ggplot1 = murders %>% ggplot(mapping = aes(x=population/10^6, y=total)) 
-
-  ggplot1 + geom_point(aes(col=region), size = 4) +
-  geom_text_repel(aes(label=abb)) +
-  scale_x_log10() +
-  scale_y_log10() +
-  geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
-  xlab("Populations in millions (log10 scale)") + 
-  ylab("Total number of murders (log10 scale)") +
-  ggtitle("US Gun Murders in 2010") +
-  scale_color_discrete(name = "Region")+
-      theme_bw()
-
-
-

-
-
-
-
-

Statistical Modeling

-
-
-

Conclusion

-
-
-
- - -
- -

References

-
-Bro, Rasmus, and Age K Smilde. 2014. “Principal Component Analysis.” Analytical Methods 6 (9): 2812–31. -
-
-Efromovich, S. 2008. Nonparametric Curve Estimation: Methods, Theory, and Applications. Springer Series in Statistics. Springer New York. https://books.google.com/books?id=mdoLBwAAQBAJ. -
-
-R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org. -
-
-Wang, Ming. 2014. “Generalized Estimating Equations in Longitudinal Data Analysis: A Review and Recent Developments.” Advances in Statistics 2014. -
-
- - -
\ No newline at end of file
+SVM application in Data Mining
+ +
+ +
+
+

Report Sample

+
+ + + +
+ +
+
Author
+
+

Student name

+
+
+ +
+
Published
+
+

August 22, 2023

+
+
+ + +
+ + +
+ +
+

Introduction

+

This is an introduction to kernel regression, a non-parametric technique that estimates the conditional expectation of one random variable given another. The goal of kernel regression is to discover a possibly non-linear relationship between two random variables. Kernel estimation, also called kernel smoothing, is the main method for estimating such a curve in non-parametric statistics. In a kernel estimator, the weight function is known as the kernel function (Efromovich 2008). Cite this paper (Bro and Smilde 2014). The GEE (Wang 2014).

+
+
+

Methods

+

The common non-parametric regression model is \(Y_i = m(X_i) + \varepsilon_i\), where \(Y_i\) is the sum of the regression function value \(m(X_i)\) at \(X_i\) and an error term \(\varepsilon_i\). Here \(m(x)\) is unknown and the \(\varepsilon_i\) are random errors. With this definition, we can build an estimate by local averaging, i.e. \(m(x)\) can be estimated by averaging the \(Y_i\) whose \(X_i\) is near \(x\). In other words, we fit a curve through the data points using the surrounding data points. The estimation formula is given below (R Core Team 2019):

+

\[
M_n(x) = \sum_{i=1}^{n} W_n (X_i) Y_i \tag{1}
\]
Here \(W_n(X_i)\) is the real-valued weight assigned to observation \(i\). Weights are positive and small if \(X_i\) is far from \(x\).
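The weighted average in Equation (1) can be sketched in a few lines of R. This is only an illustration under stated assumptions: the report does not specify a kernel or a bandwidth, so the Gaussian kernel (`dnorm`), the bandwidth `h`, the function name `nw_estimate`, and the simulated data are all hypothetical choices made for the example.

```r
# Kernel-weighted local average of the responses, as in Equation (1).
# Assumptions (not from the report): Gaussian kernel, user-chosen bandwidth h.
nw_estimate <- function(x0, x, y, h) {
  k <- dnorm((x - x0) / h)  # kernel value for each observation; small when x is far from x0
  w <- k / sum(k)           # normalized weights: positive and summing to 1
  sum(w * y)                # weighted average of the responses at x0
}

# Simulated data for illustration only
set.seed(1)
x <- runif(200, 0, 2 * pi)
y <- sin(x) + rnorm(200, sd = 0.2)

# Evaluate the estimate on a small grid of points
grid <- seq(0.5, 5.5, by = 0.5)
fit <- sapply(grid, nw_estimate, x = x, y = y, h = 0.3)
```

A smaller `h` makes the estimate follow the data more closely, while a larger `h` averages over more neighbors and gives a smoother curve.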

+
+
+

Analysis and Results

+
+

Data and Visualization

+

A study was conducted to determine how…

+
+
+Code +
# loading packages 
+library(tidyverse)
+library(knitr)
+library(ggthemes)
+library(ggrepel)
+library(dslabs)
+
+
+
+
+Code +
# Load Data
+kable(head(murders))
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
state       abb  region  population  total
Alabama     AL   South      4779736    135
Alaska      AK   West        710231     19
Arizona     AZ   West       6392017    232
Arkansas    AR   South      2915918     93
California  CA   West      37253956   1257
Colorado    CO   West       5029196     65
+
+
+Code +
ggplot1 = murders %>% ggplot(mapping = aes(x=population/10^6, y=total)) 
+
+  ggplot1 + geom_point(aes(col=region), size = 4) +
+  geom_text_repel(aes(label=abb)) +
+  scale_x_log10() +
+  scale_y_log10() +
+  geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
+  xlab("Populations in millions (log10 scale)") + 
+  ylab("Total number of murders (log10 scale)") +
+  ggtitle("US Gun Murders in 2010") +
+  scale_color_discrete(name = "Region")+
+      theme_bw()
+
+
+

+
+
+
+
+

Statistical Modeling

+
+
+

Conclusion

+
+
+
+ + +
+ +

References

+
+Bro, Rasmus, and Age K Smilde. 2014. “Principal Component Analysis.” Analytical Methods 6 (9): 2812–31. +
+
+Efromovich, S. 2008. Nonparametric Curve Estimation: Methods, Theory, and Applications. Springer Series in Statistics. Springer New York. https://books.google.com/books?id=mdoLBwAAQBAJ. +
+
+R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org. +
+
+Wang, Ming. 2014. “Generalized Estimating Equations in Longitudinal Data Analysis: A Review and Recent Developments.” Advances in Statistics 2014. +
+
+ + +
diff --git a/index.qmd b/index.qmd
index 1fd3c877f..f4b269547 100644
--- a/index.qmd
+++ b/index.qmd
@@ -1,6 +1,6 @@
 ---
-title: "Report Sample"
-author: "Student name"
+title: "SVM application in Data Mining in EMR"
+author: "Brad Lipson"
 date: '`r Sys.Date()`'
 format:
   html: