Skip to content

find-symbol can fail with "Requested array size exceeds VM limit" #245

Open
@vemv

Description

@vemv

Hi there,

I've faced this issue repeatedly. In the past I had worked around it via :jvm-opts ["-Xmx18G"], but today I hit it again (long time no see!). I could try setting an even higher bound, but I guess a GH issue will be more useful.

The error is:

  • When I cljr-rename-symbol, through natural usage,
  • I get the attached stacktrace printed to Emacs' *Messages*
Fetching symbol occurrences...
Error in nrepl-refactor: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf (Arrays.java:3332)
    java.lang.AbstractStringBuilder.ensureCapacityInternal (AbstractStringBuilder.java:124)
    java.lang.AbstractStringBuilder.append (AbstractStringBuilder.java:448)
    java.lang.StringBuffer.append (StringBuffer.java:270)
    java.io.StringWriter.write (StringWriter.java:101)
    clojure.core$fn__7283.invokeStatic (core_print.clj:159)
    clojure.core/fn (core_print.clj:158)
    clojure.lang.MultiFn.invoke (MultiFn.java:234)
    clojure.core$pr_on.invokeStatic (core.clj:3674)
    clojure.core$pr_on.invoke (core.clj:3668)
    clojure.core$print_prefix_map$fn__7317.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:66)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7344.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:234)
    clojure.core$pr_on.invokeStatic (core.clj:3674)
    clojure.core$pr_on.invoke (core.clj:3668)
    clojure.core$print_prefix_map$fn__7317.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:66)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7344.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:234)
    clojure.core$pr_on.invokeStatic (core.clj:3674)
    clojure.core$pr_on.invoke (core.clj:3668)
    clojure.core$print_prefix_map$fn__7317.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:66)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)

Interestingly, I can reproduce the error via the repl, skipping the middleware.

(refactor-nrepl.find.find-symbol/find-symbol   {:file "/path/to/file/page.clj",
                                                :ns "some.ns"
                                                :name "handler"
                                                :line 43
                                                :column 6
                                                :ignore-errors false})

...and further playing at the repl, I could find the exact expression at fault:

(->> (core/dirs-on-classpath)
(mapcat (partial core/find-in-dir (some-fn core/clj-file? core/cljc-file?)))
(mapcat (partial find-symbol-in-file fully-qualified-name ignore-errors)))))

I can run for example

(->> (core/dirs-on-classpath)
     (mapcat (partial core/find-in-dir (some-fn core/clj-file? core/cljc-file?)))
     (mapcat (partial find-symbol-in-file "clojure.core/+" false))
     count)

And it will reliably OOM.

Let me know if I can help you reproduce the error again. Perhaps you have it easy by setting a deliberately small -Xmx.

Victor

Activity

vemv

vemv commented on Mar 28, 2019

@vemv
MemberAuthor

Side note, the pinpointed code seems a good target for parallelization. Perhaps if done with reducers, the change wouldn't be too invasive.

vemv

vemv commented on Mar 28, 2019

@vemv
MemberAuthor

One last note, (->> (core/dirs-on-classpath) (mapcat (partial core/find-in-dir (some-fn core/clj-file? core/cljc-file?))) count) returns 276 for me, each file having a reasonable size. i.e. it's not an humongous project.

vemv

vemv commented on Mar 28, 2019

@vemv
MemberAuthor

I further debugged this.

  • The culrpit boils down to refactor-nrepl.find.find-symbol/find-symbol-in-file
    • Specifically, its (ana/ns-ast file-content) call
  • One specific input that will cause it to OOM is the following:
(defn foo
  [{:keys [x]}]
  (go-loop []
    ))

...Namely:

  • clojure.core.async/go-loop must be there
  • A destructured argument must also be there.

You can apply this patch, and easily reproduce the issue:

  • lein repl
  • (require 'refactor-nrepl.analyzer)
  • (refactor-nrepl.analyzer/ns-ast (slurp "test/resources/testproject/src/com/example/vemv.clj")) -> OOM
benedekfazekas

benedekfazekas commented on Mar 28, 2019

@benedekfazekas
Member

thanks for the really good analysis. we have several probs with core.async as far as i remember due to its macro heavy nature. will have a closer look, ta

vemv

vemv commented on Apr 9, 2019

@vemv
MemberAuthor

Hey there,

have you had a chance (or foresee one soon enough)?

Btw, this issue report is a bit messy, it boils down to the last comment. I can close + re-create this ticket if that tidies up things.

slipset

slipset commented on Dec 7, 2019

@slipset

I've been hit by this as well.
On our largish codebase (70k lines of Clojure), it seems like the caching of the ast done in
refactor-nrepl.analyzer fails to handle the complete ast of our project (at least on my machine)

When commenting out https://github.com/clojure-emacs/refactor-nrepl/blob/master/src/refactor_nrepl/analyzer.clj#L53 it seems to run without the OOM, which I think makes sense, as I can imagine the total ast for this project would be rather large?

expez

expez commented on Dec 9, 2019

@expez
Member

@slipset I just had the same error occur on a smaller project (8k lines of clojure).

error in process filter: Error in nrepl-refactor: java.lang.OutOfMemoryError: Java heap space
 at java.util.Arrays.copyOf (Arrays.java:3332)
    java.lang.AbstractStringBuilder.ensureCapacityInternal (AbstractStringBuilder.java:124)
    java.lang.AbstractStringBuilder.append (AbstractStringBuilder.java:448)
    java.lang.StringBuffer.append (StringBuffer.java:270)
    java.io.StringWriter.write (StringWriter.java:101)
    cider.nrepl.print_method$eval31775$fn__31776.invoke (print_method.clj:69)
    clojure.lang.MultiFn.invoke (MultiFn.java:233)
    clojure.core$pr_on.invokeStatic (core.clj:3666)
    clojure.core$pr_on.invoke (core.clj:3660)
    clojure.core$print_prefix_map$fn__7041.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:61)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7068.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:233)
    clojure.core$pr_on.invokeStatic (core.clj:3666)
    clojure.core$pr_on.invoke (core.clj:3660)
    clojure.core$print_prefix_map$fn__7041.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:61)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7068.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:233)
    clojure.core$pr_on.invokeStatic (core.clj:3666)
    clojure.core$pr_on.invoke (core.clj:3660)
    clojure.core$print_prefix_map$fn__7041.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:61)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7068.invokeStatic (core_print.clj:266)

This is the same stacktrace as in the OP, but the error is slightly different.

Looking at the stacktrace print_prefix_map occurs 6 times. Perhaps there's a problem with cycles here?

mbuczko

mbuczko commented on Dec 17, 2019

@mbuczko
Contributor

@slipset @vemv could you try to reproduce the problem against newest master?

vemv

vemv commented on Dec 18, 2019

@vemv
MemberAuthor

Hi, thanks for attending the issue!

Yes I could still reproduce it.

Updated repro instuctions are as follow:

  • git checkout fresh https://github.com/clojure-emacs/refactor-nrepl master
  • curl https://gist.githubusercontent.com/vemv/a01b63102c3fb6b19d37f376e78546fd/raw/74d064cac449f6063f4751f6fa9f3e8e8fe53899/gistfile1.txt | git apply
  • (require 'refactor-nrepl.analyzer)
  • (refactor-nrepl.analyzer/ns-ast (slurp "testproject/src/com/example/vemv.clj")) -> OOM

My env:

  • Personal .lein profile stuff was left nil
  • High-end MBP
mbuczko

mbuczko commented on Dec 23, 2019

@mbuczko
Contributor

I think this is still related to the problem I tried to solve in #278. (refactor-nrepl.analyzer/ns-ast (slurp "test/resources/testproject/src/com/example/vemv.clj")) at some point tries to stringify the result to display it in REPL I guess, and we painfully end up with OOM.

Let's try the other way:

(def ast 
   (refactor-nrepl.analyzer/ns-ast (slurp "test/resources/testproject/src/com/example/vemv.clj")))

if we now try to stringify the result, like this:

(str ast)

again - we get OOM. This is more or less what I fixed in #278.

Lets instead redirect output directly to file:

(with-open [w (clojure.java.io/writer "/tmp/ast.txt")]
  (binding [*out* w] (pr ast)))

Now, after few minutes (and assuming you had enough free space on your drive) you can see where is the problem . Generated file is gigantic. I interrupted entire dump after reaching 37GB.

That's insane :)

expez

expez commented on Jun 28, 2021

@expez
Member

@vemv Any hope of narrowing this done? I'm not sure how to tackle this one, but it's probably one of the most important outstanding issues in refactor-nrepl atm.

Since eastwood manages to use tools.analyzer on this code I'm guessing we're doing something we shouldn't somewhere, much like the bug @mbuczko closed in #278.

vemv

vemv commented on Jun 28, 2021

@vemv
MemberAuthor

As a thing that maybe can help, this wall of flags helps me avoid all sort of OOMs (most specifically, for stack-heavy workloads):

[;; Prevents a specific type of OOMs:
 "-XX:CompressedClassSpaceSize=3G"
 ;; Prevents trivial StackOverflow errors:
 "-XX:MaxJavaStackTraceDepth=1000000"
 ;; Set a generous limit as the maximum that can be allocated, preventing certain types of OOMs:
 "-Xmx24G"
 ;; increase stack size x6, for preventing SO errors:
 ;;   (The current default can be found with
 ;;    `java -XX:+PrintFlagsFinal -version 2>/dev/null | grep "intx ThreadStackSize"`)
 "-Xss6144k"]

Also @slipset's comment seems on point. Keeping an AST cache seems a delicate business: outdated data can grow without control, correct? So something like a core.cache SoftCache might be more cautious.

Or keeping things simpler, one could have a per-run cache: things would be cached for a specific refactor-nrepl op, but not across ops. I realise one will have more cache misses this way, but one gains simplicity and a reduced chance of OOMs.

Finally, one nice thing I came up for Eastwood recently is jonase/eastwood@aa569a5 . It might not be related directly to our issue here, but omitting core.async exceptions can be good in other contexts / for other issues.

expez

expez commented on Jun 28, 2021

@expez
Member

outdated data can grow without control, correct?

Looks like we have a bit of a bug in the cache, in the sense that it's append-only and old shit is never thrown out. On the other hand it won't grow on it's own. Unless you're using the refactoring a lot and/or have a very large change set I doubt this is it. I'll add another issue tracking that (should be very easy to fix), but I think this one here is something different.

edit: I just did a quick read-through of the cache code and it looks OK. It won't grow out of bounds. We also know that a single file can produce an AST that's too large to handle. Guess the next step might be to see what sort of recursion happens when such an AST is created...

vemv

vemv commented on Jun 28, 2021

@vemv
MemberAuthor

Yes the #245 (comment) repro hints so

4 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

    Development

    No branches or pull requests

      Participants

      @mbuczko@benedekfazekas@expez@vemv@slipset

      Issue actions

        find-symbol can fail with "Requested array size exceeds VM limit" · Issue #245 · clojure-emacs/refactor-nrepl