Commit be5629a by MilesCranmer
1 Parent(s): fa629f3
Add prime number example
docs/examples.md CHANGED (+109 -1)
@@ -173,7 +173,115 @@ print(model)
 If all goes well, you should find that it predicts the correct input equation, without the noise term!

-## 7.
+## 7. Julia packages and types
+
+PySR uses [SymbolicRegression.jl](https://github.com/MilesCranmer/SymbolicRegression.jl)
+as its search backend. This is a pure Julia package, and so can interface easily with any other
+Julia package.
+For some tasks, it may be necessary to load such a package.
+
+For example, suppose we wish to find the following relationship:
+
+$$ y = p_{3x + 1} - 5, $$
+
+where $p_i$ is the $i$th prime number, and $x$ is the input feature.
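As a quick sanity check of the target relation, here are the first few inputs worked by hand, using $p_1 = 2$, $p_4 = 7$, and $p_7 = 17$:

$$ x = 0:\; y = p_1 - 5 = -3, \qquad x = 1:\; y = p_4 - 5 = 2, \qquad x = 2:\; y = p_7 - 5 = 12. $$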
+
+Let's see if we can discover this relationship between $x$ and $y$, using
+the [Primes.jl](https://github.com/JuliaMath/Primes.jl) package.
+
+First, let's manually initialize the Julia backend (here, with 8 threads):
+
+```python
+import pysr
+jl = pysr.julia_helpers.init_julia(julia_kwargs={"threads": 8})
+```
+`jl` is the Julia runtime.
+
+Now, let's run some Julia code to add the Primes.jl
+package to the PySR environment:
+
+```python
+jl.eval("""
+import Pkg
+Pkg.add("Primes")
+""")
+```
+
+This imports the Julia package manager, and uses it to install
+`Primes.jl`.
+
+Now, let's import it:
+
+```python
+jl.eval("import Primes")
+```
+
+Now, let's define a custom operator. We can then pass this
+to PySR later on.
+
+```python
+jl.eval("""
+function p(i::T) where T
+    if (0.5 < i < 1000)
+        return T(Primes.prime(round(Int, i)))
+    else
+        return T(NaN)
+    end
+end
+""")
+```
+
+We have created a custom operator `p`, which takes an arbitrary number as input.
+It then checks whether the input is between 0.5 and 1000.
+If out-of-bounds, it returns `NaN`.
+If in-bounds, it rounds it to the nearest integer, and returns the corresponding prime number, mapped to the same type as input.
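To make the operator's behavior concrete, here is a minimal pure-Python sketch of the same logic. It is not part of PySR or Primes.jl: the names `first_primes`, `PRIMES`, and `p_reference` are hypothetical, and a simple sieve stands in for `Primes.prime`. It only handles Python floats, whereas the Julia version is generic over the input type.

```python
import math


def first_primes(count):
    """Return the first `count` primes using a simple sieve of Eratosthenes."""
    limit = 16
    while True:
        sieve = [True] * (limit + 1)
        sieve[0] = sieve[1] = False
        for n in range(2, math.isqrt(limit) + 1):
            if sieve[n]:
                for m in range(n * n, limit + 1, n):
                    sieve[m] = False
        found = [n for n, is_prime in enumerate(sieve) if is_prime]
        if len(found) >= count:
            return found[:count]
        limit *= 2  # not enough primes yet: double the sieve size and retry


# Hypothetical lookup table; index 0 is unused so PRIMES[i] is the i-th prime.
PRIMES = [None] + first_primes(1000)


def p_reference(i):
    """Pure-Python stand-in for the Julia operator `p` (floats only)."""
    if 0.5 < i < 1000:
        # Python's round(), like Julia's round(Int, i), rounds ties to even.
        return float(PRIMES[round(i)])
    return math.nan


print(p_reference(1.0))  # index 1: the first prime
print(p_reference(3.7))  # rounds to index 4: the fourth prime
print(p_reference(0.2))  # out of bounds
```

Returning `NaN` for out-of-bounds input, rather than raising an error, keeps the operator defined for every real input, so a candidate expression that wanders outside the valid range simply produces an invalid value.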
+
+Now, let's generate some test data, starting with a lookup table of the first 998 primes.
+Since we are using PyJulia, we can pass data back and forth
+to our custom Julia operator:
+
+```python
+primes = {i: jl.p(i*1.0) for i in range(1, 999)}
+```
+
+And let's create a dataset:
+
+```python
+import numpy as np
+
+X = np.random.randint(0, 100, 100)[:, None]
+y = [primes[3*X[i, 0] + 1] - 5 for i in range(100)]
+```
+
+Finally, let's create a PySR model, and pass the custom operator. We also need to define the sympy equivalent, which we can leave as a placeholder for now:
+
+```python
+from pysr import PySRRegressor
+import sympy
+
+class sympy_p(sympy.Function):
+    pass
+
+model = PySRRegressor(
+    binary_operators=["+", "-", "*", "/"],
+    unary_operators=["p"],
+    niterations=1000,
+    extra_sympy_mappings={"p": sympy_p}
+)
+```
+
+We are all set to go! Let's see if we can find the true relation:
+
+```python
+model.fit(X, y)
+```
+
+If all works out, you should be able to see the true relation (note that the constant offset might not be exactly 1, since the operator rounds its input to the nearest integer).
+You can get the sympy version of the last row with:
+
+```python
+model.sympy(index=-1)
+```
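The note about the constant offset can be checked without Julia. The sketch below is illustrative only: `prime_index` is a hypothetical helper, and the three offsets are arbitrary values chosen strictly inside (0.5, 1.5). It shows that any such offset rounds to the same prime index as the true offset of 1 over the dataset's input range, so the search cannot tell them apart.

```python
def prime_index(x, offset):
    """Index passed to the prime lookup: 3x + offset, rounded to the nearest integer."""
    return round(3 * x + offset)


xs = range(100)  # same input range as the dataset above
offsets = [0.6, 1.0, 1.4]  # illustrative values strictly inside (0.5, 1.5)

# Every offset yields the same index, so the fits are indistinguishable on this data.
equivalent = all(
    prime_index(x, c) == 3 * x + 1 for x in xs for c in offsets
)
print(equivalent)
```

An offset outside that interval, such as 1.6, shifts the index by one and would no longer match the data.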
+
+## 8. Additional features

 For the many other features available in PySR, please
 read the [Options section](options.md).