Commit 87880d1 by MilesCranmer
Parent(s): 8685680
Deprecate ncyclesperiteration -> ncycles_per_iteration
Files changed:
- README.md +1 -1
- docs/options.md +2 -2
- docs/tuning.md +2 -2
- pysr/deprecated.py +1 -0
- pysr/param_groupings.yml +1 -1
- pysr/sr.py +5 -5
- pysr/test/params.py +1 -1
- pysr/test/test.py +2 -2
- pysr/test/test_warm_start.py +1 -1
README.md
CHANGED
@@ -297,7 +297,7 @@ model = PySRRegressor(
     # ^ 2 populations per core, so one is always running.
     population_size=50,
     # ^ Slightly larger populations, for greater diversity.
-    ncyclesperiteration=500,
+    ncycles_per_iteration=500,
     # ^ Generations between migrations.
     niterations=10000000, # Run forever
     early_stop_condition=(
docs/options.md
CHANGED
@@ -78,11 +78,11 @@ with the equations.
 Each cycle considers every 10-equation subsample (re-sampled for each individual 10,
 unless `fast_cycle` is set in which case the subsamples are separate groups of equations)
 a single time, producing one mutated equation for each.
-The parameter `ncyclesperiteration` defines how many times this
+The parameter `ncycles_per_iteration` defines how many times this
 occurs before the equations are compared to the hall of fame,
 and new equations are migrated from the hall of fame, or from other populations.
 It also controls how slowly annealing occurs. You may find that increasing
-`ncyclesperiteration` results in a higher cycles-per-second, as the head
+`ncycles_per_iteration` results in a higher cycles-per-second, as the head
 worker needs to reduce and distribute new equations less often, and also increases
 diversity. But at the same
 time, a smaller number it might be that migrating equations from the hall of fame helps
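The docs text above describes `ncycles_per_iteration` as the number of evolve cycles a population runs before comparing against the hall of fame and migrating. A minimal toy sketch of that control flow (illustrative only — this is not PySR's implementation, and `run_search` is a hypothetical name):

```python
# Toy sketch of the iteration/cycle structure described in docs/options.md.
# NOT PySR's code; it only illustrates how ncycles_per_iteration separates
# cheap mutation cycles from the more expensive migration/hall-of-fame step.

def run_search(niterations, ncycles_per_iteration):
    events = []
    for _ in range(niterations):
        # Each iteration runs many independent mutation cycles...
        for _ in range(ncycles_per_iteration):
            events.append("cycle")
        # ...and only then syncs with the hall of fame and migrates equations.
        events.append("migrate")
    return events

events = run_search(niterations=2, ncycles_per_iteration=3)
```

A larger `ncycles_per_iteration` means proportionally fewer "migrate" events per cycle, which is why the head worker has less coordination work to do.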
docs/tuning.md
CHANGED
@@ -14,12 +14,12 @@ I run from IPython (Jupyter Notebooks don't work as well[^1]) on the head node o
 2. Use only the operators I think it needs and no more.
 3. Increase `populations` to `3*num_cores`.
 4. If my dataset is more than 1000 points, I either subsample it (low-dimensional and not much noise) or set `batching=True` (high-dimensional or very noisy, so it needs to evaluate on all the data).
-5. While on a laptop or single node machine, you might leave the default `ncyclesperiteration`, on a cluster with ~100 cores I like to set `ncyclesperiteration` to maybe `5000` or so, until the head node occupation is under `10%`. (A larger value means the workers talk less frequently to eachother, which is useful when you have many workers!)
+5. While on a laptop or single node machine, you might leave the default `ncycles_per_iteration`, on a cluster with ~100 cores I like to set `ncycles_per_iteration` to maybe `5000` or so, until the head node occupation is under `10%`. (A larger value means the workers talk less frequently to eachother, which is useful when you have many workers!)
 6. Set `constraints` and `nested_constraints` as strict as possible. These can help quite a bit with exploration. Typically, if I am using `pow`, I would set `constraints={"pow": (9, 1)}`, so that power laws can only have a variable or constant as their exponent. If I am using `sin` and `cos`, I also like to set `nested_constraints={"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}`, so that sin and cos can't be nested, which seems to happen frequently. (Although in practice I would just use `sin`, since the search could always add a phase offset!)
 7. Set `maxsize` a bit larger than the final size you want. e.g., if you want a final equation of size `30`, you might set this to `35`, so that it has a bit of room to explore.
 8. I typically don't use `maxdepth`, but if I do, I set it strictly, while also leaving a bit of room for exploration. e.g., if you want a final equation limited to a depth of `5`, you might set this to `6` or `7`, so that it has a bit of room to explore.
 9. Set `parsimony` equal to about the minimum loss you would expect, divided by 5-10. e.g., if you expect the final equation to have a loss of `0.001`, you might set `parsimony=0.0001`.
-10. Set `weight_optimize` to some larger value, maybe `0.001`. This is very important if `ncyclesperiteration` is large, so that optimization happens more frequently.
+10. Set `weight_optimize` to some larger value, maybe `0.001`. This is very important if `ncycles_per_iteration` is large, so that optimization happens more frequently.
 11. Set `turbo` to `True`. This may or not work, if there's an error just turn it off (some operators are not SIMD-capable). If it does work, it should give you a nice 20% speedup.
 12. For final runs, after I have tuned everything, I typically set `niterations` to some very large value, and just let it run for a week until my job finishes (genetic algorithms tend not to converge, they can look like they settle down, but then find a new family of expression, and explore a new space). If I am satisfied with the current equations (which are visible either in the terminal or in the saved csv file), I quit the job early.
pysr/deprecated.py
CHANGED
@@ -79,6 +79,7 @@ def make_deprecated_kwargs_for_pysr_regressor():
     warmupMaxsizeBy => warmup_maxsize_by
     useFrequency => use_frequency
     useFrequencyInTournament => use_frequency_in_tournament
+    ncyclesperiteration => ncycles_per_iteration
     """
     # Turn this into a dict:
     deprecated_kwargs = {}
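The diff above appends the renamed kwarg to a plain-text `old => new` table that the function then parses into a dict (per its `# Turn this into a dict:` comment). A hedged sketch of that parsing step — the exact PySR helper may differ, and the second mapping line here is only a plausible example entry:

```python
# Sketch of turning an "old => new" mapping string into a dict,
# as make_deprecated_kwargs_for_pysr_regressor's comment suggests.
# Hypothetical reconstruction, not the exact PySR code.

deprecation_string = """
    ncyclesperiteration => ncycles_per_iteration
    fractionReplaced => fraction_replaced
"""

deprecated_kwargs = {}
for line in deprecation_string.splitlines():
    line = line.strip()
    if not line:
        continue  # skip blank lines around the table
    old, new = line.split(" => ")
    deprecated_kwargs[old] = new
```

Keeping the table as a docstring-style string makes adding a deprecation a one-line change, as this commit demonstrates.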
pysr/param_groupings.yml
CHANGED
@@ -8,7 +8,7 @@
   - niterations
   - populations
   - population_size
-  - ncyclesperiteration
+  - ncycles_per_iteration
   - The Objective:
   - loss
   - full_objective
pysr/sr.py
CHANGED
@@ -354,7 +354,7 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
         takes a loss and complexity as input, for example:
         `"f(loss, complexity) = (loss < 0.1) && (complexity < 10)"`.
         Default is `None`.
-    ncyclesperiteration : int
+    ncycles_per_iteration : int
         Number of total mutations to run, per 10 samples of the
         population, per iteration.
         Default is `550`.
@@ -398,7 +398,7 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
         Constant optimization can also be performed as a mutation, in addition to
         the normal strategy controlled by `optimize_probability` which happens
         every iteration. Using it as a mutation is useful if you want to use
-        a large `ncyclesperiteration`, and may not optimize very often.
+        a large `ncycles_periteration`, and may not optimize very often.
         Default is `0.0`.
     crossover_probability : float
         Absolute probability of crossover-type genetic operation, instead of a mutation.
@@ -688,7 +688,7 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
         alpha: float = 0.1,
         annealing: bool = False,
         early_stop_condition: Optional[Union[float, str]] = None,
-        ncyclesperiteration: int = 550,
+        ncycles_per_iteration: int = 550,
         fraction_replaced: float = 0.000364,
         fraction_replaced_hof: float = 0.035,
         weight_add_node: float = 0.79,
@@ -756,7 +756,7 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
         self.niterations = niterations
         self.populations = populations
         self.population_size = population_size
-        self.ncyclesperiteration = ncyclesperiteration
+        self.ncycles_per_iteration = ncycles_per_iteration
         # - Equation Constraints
         self.maxsize = maxsize
         self.maxdepth = maxdepth
@@ -1652,7 +1652,7 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
             use_frequency_in_tournament=self.use_frequency_in_tournament,
             adaptive_parsimony_scaling=self.adaptive_parsimony_scaling,
             npop=self.population_size,
-            ncycles_per_iteration=self.ncyclesperiteration,
+            ncycles_per_iteration=self.ncycles_per_iteration,
             fraction_replaced=self.fraction_replaced,
             topn=self.topn,
             print_precision=self.print_precision,
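With the constructor now taking `ncycles_per_iteration`, the old spelling is routed through the deprecation table from `pysr/deprecated.py` so that existing user code keeps working with a warning. A toy sketch of that remapping pattern (illustrative only — `ToyRegressor` is a stand-in, not PySR's actual handling):

```python
import warnings

# Toy sketch of remapping a deprecated keyword argument to its new name
# at construction time. Illustrative only; PySR's real handling lives in
# PySRRegressor together with pysr/deprecated.py.
DEPRECATED_KWARGS = {"ncyclesperiteration": "ncycles_per_iteration"}

class ToyRegressor:
    def __init__(self, ncycles_per_iteration=550, **kwargs):
        for old, new in DEPRECATED_KWARGS.items():
            if old in kwargs:
                # Warn, then forward the old value to the new parameter.
                warnings.warn(f"{old} has been renamed to {new}", FutureWarning)
                ncycles_per_iteration = kwargs.pop(old)
        self.ncycles_per_iteration = ncycles_per_iteration

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model = ToyRegressor(ncyclesperiteration=100)  # old spelling still works
```

This is why the commit is a deprecation rather than a hard break: both spellings produce the same configured estimator, but only the new one is silent.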
pysr/test/params.py
CHANGED
@@ -5,4 +5,4 @@ from .. import PySRRegressor
 DEFAULT_PARAMS = inspect.signature(PySRRegressor.__init__).parameters
 DEFAULT_NITERATIONS = DEFAULT_PARAMS["niterations"].default
 DEFAULT_POPULATIONS = DEFAULT_PARAMS["populations"].default
-DEFAULT_NCYCLES = DEFAULT_PARAMS["ncyclesperiteration"].default
+DEFAULT_NCYCLES = DEFAULT_PARAMS["ncycles_per_iteration"].default
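The test helpers read defaults directly from the constructor's signature, which is why this lookup key had to change with the rename. The same `inspect.signature` pattern works on any callable; a toy function is used below as a stand-in since PySR itself may not be installed:

```python
import inspect

# Same pattern as pysr/test/params.py: pull a parameter's default value
# out of a function signature. toy_init is a hypothetical stand-in for
# PySRRegressor.__init__.
def toy_init(self, niterations=40, ncycles_per_iteration=550):
    pass

params = inspect.signature(toy_init).parameters
DEFAULT_NCYCLES = params["ncycles_per_iteration"].default
```

Note the failure mode the rename guards against: looking up the old key raises a `KeyError`, so a stale test would fail loudly rather than silently use the wrong default.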
pysr/test/test.py
CHANGED
@@ -224,7 +224,7 @@ class TestPipeline(unittest.TestCase):
         # Test if repeated fit works:
         regressor.set_params(
             niterations=1,
-            ncyclesperiteration=2,
+            ncycles_per_iteration=2,
             warm_start=True,
             early_stop_condition=None,
         )
@@ -661,7 +661,7 @@ class TestMiscellaneous(unittest.TestCase):
         model = PySRRegressor(
             niterations=int(1 + DEFAULT_NITERATIONS / 10),
             populations=int(1 + DEFAULT_POPULATIONS / 3),
-            ncyclesperiteration=int(2 + DEFAULT_NCYCLES / 10),
+            ncycles_per_iteration=int(2 + DEFAULT_NCYCLES / 10),
             verbosity=0,
             progress=False,
             random_state=0,
pysr/test/test_warm_start.py
CHANGED
@@ -78,7 +78,7 @@ class TestWarmStart(unittest.TestCase):
         model.warm_start = True
         model.niterations = 0
         model.max_evals = 0
-        model.ncyclesperiteration = 0
+        model.ncycles_per_iteration = 0

         model.fit(X, y)