MilesCranmer committed on
Commit
be5629a
1 Parent(s): fa629f3

Add prime number example

Files changed (1)
  1. docs/examples.md +109 -1
docs/examples.md CHANGED
@@ -173,7 +173,115 @@ print(model)
173
 
174
  If all goes well, you should find that it predicts the correct input equation, without the noise term!
175
 
176
- ## 7. Additional features
177
 
178
  For the many other features available in PySR, please
179
  read the [Options section](options.md).
 
173
 
174
  If all goes well, you should find that it predicts the correct input equation, without the noise term!
175
 
176
+ ## 7. Julia packages and types
177
+
178
+ PySR uses [SymbolicRegression.jl](https://github.com/MilesCranmer/SymbolicRegression.jl)
179
+ as its search backend. This is a pure Julia package, and so can interface easily with any other
180
+ Julia package.
181
+ For some tasks, it may be necessary to load such a package.
182
+
183
+ For example, suppose we wish to find the following relationship:
184
+
185
+ $$ y = p_{3x + 1} - 5, $$
186
+
187
+ where $p_i$ is the $i$th prime number, and $x$ is the input feature.
188
+
189
+ Let's see if we can discover this relationship between $x$ and $y$, using
190
+ the [Primes.jl](https://github.com/JuliaMath/Primes.jl) package.
191
+
192
+ First, let's manually initialize the Julia backend (here, with 8 threads):
193
+
194
+ ```python
195
+ import pysr
196
+ jl = pysr.julia_helpers.init_julia(julia_kwargs={"threads": 8})
197
+ ```
198
+ `jl` is the Julia runtime.
199
+
200
+ Now, let's run some Julia code to add the Primes.jl
201
+ package to the PySR environment:
202
+
203
+ ```python
204
+ jl.eval("""
205
+ import Pkg
206
+ Pkg.add("Primes")
207
+ """)
208
+ ```
209
+
210
+ This imports the Julia package manager, and uses it to install
211
+ `Primes.jl`.
212
+
213
+ Now, let's import it:
214
+
215
+ ```python
216
+ jl.eval("import Primes")
217
+ ```
218
+
219
+ Now, let's define a custom operator. We can then pass this
220
+ to PySR later on.
221
+
222
+ ```python
223
+ jl.eval("""
224
+ function p(i::T) where T
225
+     if (0.5 < i < 1000)
226
+         return T(Primes.prime(round(Int, i)))
227
+     else
228
+         return T(NaN)
229
+     end
230
+ end
231
+ """)
232
+ ```
233
+
234
+ We have created a custom operator `p`, which takes an arbitrary number as input.
235
+ It then checks whether the input is between 0.5 and 1000.
236
+ If out-of-bounds, it returns `NaN`.
237
+ If in-bounds, it rounds it to the nearest integer, and returns the corresponding prime number, mapped to the same type as input.
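
To make the operator's behaviour concrete, here is a rough pure-Python analogue of `p`. It is only an illustrative sketch: the names `nth_prime` and `p_reference` are hypothetical, the trial-division prime finder stands in for `Primes.prime`, and PySR itself never calls this code.

```python
import math


def nth_prime(n: int) -> int:
    """Return the n-th prime (1-indexed) using simple trial division."""
    count = 0
    candidate = 1
    while count < n:
        candidate += 1
        # candidate is prime if no d in [2, sqrt(candidate)] divides it
        if all(candidate % d for d in range(2, math.isqrt(candidate) + 1)):
            count += 1
    return candidate


def p_reference(i: float) -> float:
    """Hypothetical stand-in for the Julia operator `p` defined above."""
    if 0.5 < i < 1000:
        # In-bounds: round to the nearest integer, then look up that prime
        return float(nth_prime(round(i)))
    # Out-of-bounds: signal an invalid input with NaN
    return math.nan
```

Under these assumptions, `p_reference(2.4)` rounds its argument to 2 and returns the second prime, while inputs at or below 0.5, or at or above 1000, yield `NaN`.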
238
+
239
+ Now, let's precompute a lookup table of the first 998 primes, which we will use to generate test data.
240
+ Since we are using PyJulia, we can pass data back and forth
241
+ to our custom Julia operator:
242
+
243
+ ```python
244
+ primes = {i: jl.p(i*1.0) for i in range(1, 999)}
245
+ ```
246
+
247
+ And let's create a dataset:
248
+
249
+ ```python
250
+ X = np.random.randint(0, 100, 100)[:, None]
251
+ y = [primes[3*X[i, 0] + 1] - 5 for i in range(100)]
252
+ ```
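
As a quick sanity check on the dataset above, the sketch below confirms that every index `3*x + 1` falls inside the lookup table. It assumes only what the snippets show: `np.random.randint(0, 100, ...)` draws integers from 0 to 99, and the table has keys `range(1, 999)`. The variable names are illustrative.

```python
# All values X can take: randint(0, 100) draws from 0..99 inclusive
x_values = range(0, 100)

# Indices used to look up primes when building y
indices = [3 * x + 1 for x in x_values]

# Keys available in the precomputed lookup table
lookup_keys = range(1, 999)

# Every index must be a valid key, otherwise building y would raise KeyError
all_covered = all(i in lookup_keys for i in indices)
print(min(indices), max(indices), all_covered)
```

The largest index is 298, so the table is larger than this dataset strictly needs.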
253
+
254
+ Finally, let's create a PySR model, and pass the custom operator. We also need to define the sympy equivalent, which we can leave as a placeholder for now:
255
+
256
+ ```python
257
+ from pysr import PySRRegressor
258
+ import sympy
259
+
260
+ class sympy_p(sympy.Function):
261
+     pass
262
+
263
+ model = PySRRegressor(
264
+     binary_operators=["+", "-", "*", "/"],
265
+     unary_operators=["p"],
266
+     niterations=1000,
267
+     extra_sympy_mappings={"p": sympy_p}
268
+ )
269
+ ```
270
+
271
+ We are all set to go! Let's see if we can find the true relation:
272
+
273
+ ```python
274
+ model.fit(X, y)
275
+ ```
276
+
277
+ If all works out, you should be able to see the true relation (note that the constant offset might not be exactly 1, since the operator rounds its argument to the nearest integer).
278
+ You can get the sympy version of the last row with:
279
+
280
+ ```python
281
+ model.sympy(index=-1)
282
+ ```
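
To see why the recovered offset need not be exactly 1, note that the operator rounds its argument before looking up the prime. The sketch below (with a hypothetical helper, `effective_index`) shows that several different offsets select the same prime index for every `x` in the dataset's range.

```python
def effective_index(x: int, c: float) -> int:
    """Index that p(3*x + c) would look up after rounding (illustrative helper)."""
    return round(3 * x + c)


# Offsets strictly between 0.5 and 1.5 all round to the same index as c = 1
offsets = [0.6, 0.8, 1.0, 1.2, 1.4]
same_index = all(
    effective_index(x, c) == 3 * x + 1
    for x in range(100)
    for c in offsets
)
print(same_index)
```

An expression such as `p(3*x + 1.2) - 5` is therefore indistinguishable from the true relation on this data.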
283
+
284
+ ## 8. Additional features
285
 
286
  For the many other features available in PySR, please
287
  read the [Options section](options.md).