Simpqs-2.0 full #48
Add a balancing _destroy_factor() to free memory grabbed by _init_factor().
To be reversed after edits are done.
_GMP_init/destroy; use MPUG prime iterators; use MPUG verbosity; remove REPORT (since it is handled by verbosity). The standalone program now takes the number to factor and optional verbosity as command-line arguments.
QS factorization failure has been observed due to primecount[] overflowing a 16-bit unsigned short. We end up setting primecount[] values to a count of relations, of which there are (of the order of) the number of primes in the factorbase, so it should be of the same type as primesNo[].
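The overflow described above can be illustrated with a minimal sketch. It assumes a 16-bit `unsigned short` (modelled here with `uint16_t`); the function names are hypothetical and `unsigned long` merely stands in for whatever type primesNo[] actually uses.

```c
#include <stdint.h>

/* Hypothetical helper: store a relation count in a 16-bit slot,
 * as primecount[] did before the fix. Values above 65535 wrap. */
static uint16_t store_in_u16(unsigned long relations) {
    return (uint16_t)relations;   /* reduced modulo 65536 */
}

/* Hypothetical helper: the same count kept in a wider type
 * (an assumed stand-in for the type of primesNo[]). */
static unsigned long store_in_ulong(unsigned long relations) {
    return relations;             /* no truncation */
}
```

With 70000 relations the 16-bit slot would hold 70000 - 65536 = 4464, silently corrupting the count, while the wider type keeps the true value.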
Contents will be mpz_init()ed
Possible bugs or opportunities for improvement.
1 complaint for 7520587276150280509023659781989524950620229823 from l = 1936
7 complaints for 1000000000000000000000000000045999999999999999999999999999373 from l in (5937..5943)
mpz_t q, r are used only to hold M /% CACHEBLOCKSIZE, with M <= 192000. Replace them with unsigned long Mq, Mr.
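A rough sketch of that replacement, assuming M always fits in an unsigned long (which M <= 192000 guarantees). The CACHEBLOCKSIZE value and the function name are assumptions for illustration, not taken from the patch.

```c
/* Assumed block size, for illustration only. */
#define CACHEBLOCKSIZE 64000UL

/* Hypothetical helper: split the sieve half-length M into whole
 * cache blocks (Mq) and a leftover (Mr) with plain integer
 * arithmetic, avoiding the mpz_t temporaries q and r. */
static void split_sieve_length(unsigned long M,
                               unsigned long *Mq, unsigned long *Mr) {
    *Mq = M / CACHEBLOCKSIZE;
    *Mr = M % CACHEBLOCKSIZE;
}
```

Since both results are small, native division is enough and saves two mpz_init()/mpz_clear() pairs.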
This wasn't important before, since it is used as a cut-off for partials.
Matrix stored in la_col_t, full and partial relations stored in rel_t, collected in dynamic array arel_t. Block Lanczos code imported almost unchanged from simpqs-2.0, just minor changes to fix memory leaks; relations structure added by hv to replace temp files used in simpqs.
Another change incorporated from simpqs-2.0.
Whitespace and minor comment fixups only.
|
I note that this occasionally shows: This does not seem to cause a problem, and it seems one or two retries is always enough to succeed; however the noise to stdout is a bit irritating, and should probably hide behind a verbosity level. I'll try to get some stats on how often this happens, and whether it depends on the size of the matrix. (For a factor base of size |
It appears to be normal for the algorithm to fail occasionally: about 4% of the time in tests (and more often on a given matrix after it has failed once). Allow for this by a) retrying and b) reporting only if verbosity > 3. Introduce a hard limit of 100 retries, just in case.
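The retry policy described above could look roughly like the following. This is a sketch only: the callback type, the function names and the choice to count attempts rather than retries are all assumptions, and the real block Lanczos call has a different signature.

```c
#include <stdio.h>

#define MAX_TRIES 100   /* hard limit, just in case */

/* Hypothetical solver callback: returns nonzero on success. */
typedef int (*attempt_fn)(void *state);

/* Call attempt() until it succeeds or the limit is reached.
 * Returns the number of attempts used, or 0 if all failed.
 * Failures are reported only when verbose > 3. */
static int run_with_retries(attempt_fn attempt, void *state, int verbose) {
    for (int tries = 1; tries <= MAX_TRIES; tries++) {
        if (attempt(state))
            return tries;
        if (verbose > 3)
            printf("linear algebra attempt %d failed, retrying\n", tries);
    }
    return 0;
}

/* Demo callback: fails until the counter in *state reaches zero. */
static int fail_first_n(void *state) {
    int *remaining = state;
    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}
```

A callback that fails twice succeeds on the third attempt; one that never succeeds gives up after MAX_TRIES attempts instead of looping forever.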
|
It appears that "not all columns used" is relatively common, and "Failed, no rows found (mask == 0)" is rarer but also possible (and should not be fatal). I've added a commit to cope with those more cleanly. I assume the remaining check "lanczos error: dependencies don't work" would represent an actual bug, and have left that as a fatal error. I ran my code using this for a while with extra instrumentation. There were 1700 numbers that reached Since the evaluation time is typically about 15-20% of the total (slightly larger for the smallest inputs), this implies runtimes about 1% longer on average than with no failures. There was no obvious correlation with size of target, and seeking |
|
Sigh, I've now found a case where - for a given state of the (QS, not block_lanczos) rng - it fails even after 10000 iterations. Analyzing the input matrix, it appears to have an unluckily high number of linearly-dependent entries: it needs For the record, the failure occurs when the QS rng starts with (I am tempted to reset the rngs for each new factorization to aid reproducibility - I was lucky that this failure occurred just 30 minutes into a run - but I think it is unwise. I'll probably instead combine the two simplistic rngs into one, and capture the initial seed to output in the case of eventual failure instead.) |
Rename to simple_random(), save the initial seed for diagnostics, and when standalone allow the seed to be set with '-r'. Running with "./qs -r1935087715 190484846390921103486305577532825800073" demonstrates a seed-specific failure.
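To illustrate why capturing the initial seed helps, here is a minimal sketch of a seedable generator that remembers its seed. It is not the simple_random() from the commit: the names are hypothetical and the generator is a textbook 32-bit linear congruential generator chosen only for brevity.

```c
#include <stdint.h>

static uint32_t sketch_state = 1;
static uint32_t sketch_seed_saved = 1;

/* Set the seed and remember it for later diagnostics. */
static void sketch_seed(uint32_t seed) {
    sketch_state = seed;
    sketch_seed_saved = seed;
}

/* Return the seed the current sequence started from. */
static uint32_t sketch_initial_seed(void) {
    return sketch_seed_saved;
}

/* Textbook LCG step (assumed constants, not the patch's generator);
 * unsigned arithmetic wraps modulo 2^32. */
static uint32_t sketch_random(void) {
    sketch_state = sketch_state * 1664525u + 1013904223u;
    return sketch_state;
}
```

Reseeding with the saved value replays the same sequence, which is what makes a seed-specific failure reproducible from a diagnostic message.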
We don't care about the sign, and this ensures we also remove duplicates that differ only by sign. This is probably a bugfix, since enough of such duplicates can cause the algorithm to fail. This is enough to fix the example failure mentioned in 3c6cc9311c; it is not yet known if it fixes all such failures.
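The effect of ignoring the sign can be sketched as below. This is an illustration under stated assumptions: plain long values stand in for the multi-precision values the real code compares, the function names are hypothetical, and the real code need not sort in order to find duplicates.

```c
#include <stdlib.h>

/* Compare two longs for qsort(). */
static int cmp_long(const void *a, const void *b) {
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: normalise every value to its absolute value,
 * sort, and drop duplicates in place. Returns the new length.
 * Values equal to LONG_MIN are not supported in this sketch. */
static size_t dedupe_ignoring_sign(long *v, size_t n) {
    if (n == 0)
        return 0;
    for (size_t i = 0; i < n; i++)
        v[i] = labs(v[i]);
    qsort(v, n, sizeof *v, cmp_long);
    size_t out = 1;
    for (size_t i = 1; i < n; i++)
        if (v[i] != v[out - 1])
            v[out++] = v[i];
    return out;
}
```

For the input {5, -3, 3, -5, 7} this keeps {3, 5, 7}: the pairs that differ only by sign collapse to a single entry, whereas a signed comparison would have kept all five.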
We expect maybe 10% of the primes to have a count of zero, so it's worth checking for.
Generate relations combined from partials directly into existing list rather than keeping them separate. This also ensures we check for duplicates between the natural and combined relations.
|
Ah, now fixed: the source value ( I note that this makes the two largest testcases about 10% slower. I assume they also had a bunch of such duplicates - not enough to leave the matrix underspecified, but enough that removing them causes significant extra work to make up the shortfall. In the future it may be worth experimenting to see if we can systematically demand fewer relations without risking underspecification. I'll continue running with this (updated) code to keep an eye out for any other problems. |
|
FWIW, here's an updated version of the tables above, adding build D (MPUG with latest commits up to c7c5a4e) and for comparison build E (latest msieve run with |
The former is a BSDism that is not universally available (e.g. on Cygwin), and we already use the latter elsewhere.
|
I've been using this implementation without problems for the last 18 months, so I have high confidence in it. If it is of interest to anybody I can update it for latest master: it needs minor adaptations to rebase. |
TLDR: quadratic sieve gets 50% speedup for large and medium inputs, some (smaller) speedup for small inputs; memory use 4x smaller for large inputs, roughly unchanged for medium and small inputs.
Incorporate block Lanczos, partial relations and sieving speedups from simpqs-2.0. There are some additional minor fixes/optimizations along the way, and comments marked '(hv)' where I think there may be additional bugs or infelicities that I was not able to resolve. Results from timing tests below show user CPU time and maxresident from
/usr/bin/time on an unloaded machine.
Build A: MPUG master @2389dcbc44, fixed to build standalone
- fails to find factors for ds=88 (asterisked) due to bug fixed in 955bb25
Build B: MPUG with simpqs-2.0, without last commit (midprime sieving)
Build C: MPUG with simpqs-2.0, with last commit
Test cases:
ds=46: 7520587276150280509023659781989524950620229823
= 342594914346335019487 * 21951835713905522424868129
ds=61: 1000000000000000000000000000045999999999999999999999999999373
= 999999999999999999999999999989 * 1000000000000000000000000000057
ds=70: 9999999999999999999999999999890000000120999999999999999999999999998669
= 999999999999999999999999999989 * 10000000000000000000000000000000000000121
ds=80: 99999999999999999999999999999999999941006299999999999999999999999999999999996283
= 99999999999999999999999999999999999941 * 1000000000000000000000000000000000000000063
ds=88: 9180435894215652020112626451519741807924491872283194331211065824189266475562769247511369
= 1059405511716795159376523757718030895723651 * 8665648604506982829490012915377592720602462019