These are some of the datasets used in the Symbolic Regression book.

Sample of 100 rows from the Friedman equation. Target variable is y; input variables are x1 to x10, of which only x1 to x5 appear in the formula. $$f(x)=10\sin(\pi x_{1}x_{2})+20\left(x_{3}-\frac{1}{2}\right)^{2}+10x_{4}+5x_{5}$$
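
As a rough illustration of the formula, the following sketch evaluates the noise-free Friedman function for rows of ten inputs. The helper names `friedman` and `sample_rows` are hypothetical, and drawing inputs uniformly from [0, 1] is an assumption; the rows in the provided file may have been sampled differently and may include noise on y.

```python
import math
import random


def friedman(x):
    """Evaluate the noise-free Friedman function for one row x = (x1, ..., x10)."""
    x1, x2, x3, x4, x5 = x[:5]  # x6..x10 do not enter the formula
    return (10.0 * math.sin(math.pi * x1 * x2)
            + 20.0 * (x3 - 0.5) ** 2
            + 10.0 * x4
            + 5.0 * x5)


def sample_rows(n, seed=0):
    """Assumption: every input is drawn uniformly from [0, 1]."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(10)] for _ in range(n)]


rows = sample_rows(100)
targets = [friedman(r) for r in rows]
```

Because x6 to x10 never enter the formula, a good symbolic regression model for this dataset should ignore them.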

Preprocessed Boston housing dataset (taken from StatLib). Target variable is log(CMEDV) or NOX.

Yacht dataset from Gerritsma, Onnink, and Versluis, "Geometry, Resistance and Stability of The Delft
Systematic Yacht Hull Series", International Shipbuilding Progress, Vol. 28, No. 328, Dec. 1981.

(taken from the UCI Machine Learning Repository)

Predict mu_dyn_avg (dynamic friction coefficient) from Exp (categorical), pressure, velocity, and initial
temperature (T_0).

Predict mu_stat (static friction coefficient) from Exp (categorical), pressure, and initial temperature
(T_0).

(data provided by Miba frictec GmbH)

Preprocessed water level dataset including features for harmonics (original source: NOAA, Station S452634 Elfin Cove, AK).

Preprocessed dataset containing data from four cells (original source: NASA
Ames Research Center)

Citation: B. Saha and K. Goebel (2007). “Battery Data Set”, NASA Prognostics Data Repository, NASA Ames
Research Center, Moffett Field, CA.

Use data from B0005, B0006, B0007 for training and B0018 for testing.
Predict the remaining discharge time from the initial capacity, number of discharge cycles, and current voltage.
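
The split by cell described above could be sketched as follows. The column name `cell` and the toy rows are hypothetical placeholders, not the actual layout of the file; only the cell identifiers come from the description.

```python
TRAIN_CELLS = {"B0005", "B0006", "B0007"}
TEST_CELLS = {"B0018"}


def split_by_cell(rows, cell_key="cell"):
    """Split rows into training and test partitions by battery cell id.

    Assumption: each row is a dict with the cell id stored under `cell_key`.
    """
    train = [r for r in rows if r[cell_key] in TRAIN_CELLS]
    test = [r for r in rows if r[cell_key] in TEST_CELLS]
    return train, test


# Toy rows for illustration only (not taken from the dataset).
rows = [
    {"cell": "B0005", "cycle": 1},
    {"cell": "B0006", "cycle": 1},
    {"cell": "B0007", "cycle": 1},
    {"cell": "B0018", "cycle": 1},
    {"cell": "B0018", "cycle": 2},
]
train, test = split_by_cell(rows)
```

Splitting by cell rather than by random rows keeps all measurements of B0018 out of training, so the test error reflects generalization to an unseen cell.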

Preprocessed dataset containing data from multiple cells (original source: NASA
Ames Research Center)

Citation: B. Saha and K. Goebel (2007). “Battery Data Set”, NASA Prognostics Data Repository, NASA Ames
Research Center, Moffett Field, CA.

Data for multiple cells with different discharge currents. Find a model that predicts the remaining discharge
time from the initial capacity, number of discharge cycles, current voltage, and discharge current.

Preprocessed dataset containing data from multiple cells (original source: NASA
Ames Research Center)

Citation: B. Saha and K. Goebel (2007). “Battery Data Set”, NASA Prognostics Data Repository, NASA Ames
Research Center, Moffett Field, CA.

Uses only a single cell (to reduce the amount of data).
Find a regression model structure that can be fit to the first part of the voltage curve and predict the
remaining curve until end of discharge.
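
The evaluation protocol for this task (fit on the first part of the curve, then extrapolate) could be sketched as below. A straight line stands in for the model structure purely for illustration; a real discharge curve is nonlinear and the model structure is what symbolic regression is meant to find. The function names, the 30-point fitting window, and the synthetic curve are all assumptions.

```python
import math


def fit_line(ts, vs):
    """Ordinary least-squares fit of v = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def rmse(pred, actual):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


# Synthetic, exactly linear toy curve (not real battery data).
times = [float(t) for t in range(100)]
volts = [4.0 - 0.001 * t for t in times]

n_fit = 30  # assumption: the first 30 points are used for fitting
slope, intercept = fit_line(times[:n_fit], volts[:n_fit])

# Extrapolate over the remaining part of the curve and score it there.
pred_rest = [slope * t + intercept for t in times[n_fit:]]
error_rest = rmse(pred_rest, volts[n_fit:])
```

The error is computed only on the part of the curve that was not used for fitting, which is what distinguishes this task from ordinary curve fitting.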