Open
Description
int main(int argc, char** argv)
{
int port;
FILE* fp;
/* Cmd.-line args */
if (argc != 4 ||
(fp = fopen(argv[1], "r")) == 0 ||
((port = atoi(argv[3])) & ~0xFFFF)) {
fputs("\nUSAGE:\n"
"test_fw <inp_file> <host> <port>\n\n", stderr);
return 1; // <== Opened stream never closed. Potential resource leak
}
...
/* some work is done here */
...
}
Return from main()
with an open file does not pose any resource leaks, as the file gets closed by C RTL right away.
It if was any other function, then yes, in such a situation the report would be justified and true, but not for main()
.
Metadata
Metadata
Assignees
Type
Projects
Milestone
Relationships
Development
No branches or pull requests
Activity
[-]False-positive: fopen() without fclose() in main() reported as a "resource leak"[/-][+]False-positive: fopen() without fclose() in main() reported as a "resource leak" clang:static analyzer[/+]llvmbot commentedon Apr 1, 2025
@llvm/issue-subscribers-clang-static-analyzer
Author: None (alavrentiev)
...
/* some work is done here */
...
}
chrchr-github commentedon Apr 2, 2025
What if
main()
leaked e.g. 1000 handles in a loop? Is there a reason not to usefclose()
?firewave commentedon Apr 2, 2025
If this would occur in code run with
fork()
orexec*()
, wouldn't that leak still manifest itself?alavrentiev commentedon Apr 2, 2025
@firewave : the report is specifically about
return
frommain()
; not any other ways to leave it.alavrentiev commentedon Apr 2, 2025
Without a
FILE*
variable reassigned or reinstated (loop, block, etc) there's no leak inmain()
, and that is easy to logically distinguish from any other situation / function.So if
main()
has an array of 1000FILE*
s and opens them all, then does some work, andreturn
s, there's no leak, either.Yes, there is. If
main()
has multiple (many!) exit points (return
s) while there's aFILE*
previously opened, there's no need to usefclose()
at each and every one of them, because that's what the language offers to do.firewave commentedon Apr 2, 2025
exec*()
andspawn*()
can be used to run another executable from a process. I am wondering if file descriptors or memory from those subprocesses might be inherited by the parent process and thus accumulate the leaks from subprocesses which do not properly clean up.alavrentiev commentedon Apr 2, 2025
@firewave : none of these calls create a "subprocess" (in the meaning that it's some sort of being a part of the original one).
exec()
completely replaces the entire process memory with a new image (file descriptors remain open, but NOTFILE*
).fork()
splits (copies) the process into two independent entities, each with its own memory. So whatever the copy does with its resources is not going to affect the original (parent) process, and vice versa.But let us not digress from the issue at hand, which is: when
main()
returns then the leakage ofFILE*
s (or file descriptors) should be handled more gratuitously than in any other C function under the same circumstances.firewave commentedon Apr 2, 2025
Thanks for elaborating. I have only ever used it in the most basic and default way and I am not familiar with all the behavior implied by the various variants and flags.
I brought this up because I have seen issues where file descriptor exhaustion was trigger by underlying processes. It has been a while though so my memory is very faint.
steakhal commentedon Apr 2, 2025
Usually, special-casing
main
leads to marginal benefits because it's a tiny narrow subset of the cases when we ever seemain
.W.r.t. this bug, it appears to me that it's also fairly easy to spot this sort of "false-positive" so it shouldn't be a huge pain point.
This makes this for me a less important issue.
We could however check if we are directly in
main
before constructing the bug report and just skip it if that's the case. That should be fairly easy to implement.steakhal commentedon Apr 2, 2025
@Flandini Do you wanna give it a try?
vakatov commentedon Apr 2, 2025
Because the Static Analyzer (SA) may actually (in the future, if not already) analyze code pathways which run through different functions (and even different compilation units), then -- won't the suggested approach basically mean that SA must not track any
FILE*
's (or bare file descriptors), period? Also, will it include sockets, etc?alavrentiev commentedon Apr 2, 2025
@firewave
JFTR, there may be a system-wide limit of simultaneously open files, which can affect any process, regardless of their hierarchal dependencies.
NagyDonat commentedon Apr 3, 2025
The analyzer does already analyze code pathways that run thorough different functions (essentially it "inlines" their definitions) and it has a
--ctu
mode that enables Cross-Translation Unit analysis which can inline functions from other translation units; but even in these cases there are limits on the analysis depth, so it can only access a minority of the code through main, so there are many other analysis runs that start from different functions.vakatov commentedon Apr 3, 2025
My point is that all functions are ultimately called from
main()
; so I am not sure that I see real difference betweenfopen()
'ing inmain()
vs anywhere else. E.g.:int main() { ....; fopen(...); ....}
versus:
void foo() { ....; fopen(...); ....}
int main() { ....; foo(); ....}
SEt-t commentedon Apr 4, 2025
I also think the same. This proposal effectively says that no
fopen()
should be tracked as eventual exit frommain()
would close the handles.fopen()
inmain()
is not a special case in C.alavrentiev commentedon Apr 4, 2025
@SEt-t :
That's not true. In the above [current version] from @vakatov, all use cases are leaks (for there are no assignments). In the previous version, only cases 2 and 4 were leaks (because no other code pathways were trackable for them -- and code may attempt to open 1000's more of such descriptors, creating potentially bad consequences). And 1 and 3 were not -- we know they were taken care of (albeit implicitly).
A return from
main()
is equivalent toexit()
-- so does code analyzer "complain" if there's anexit()
call somewhere else in the code that has an activeFILE*
open? I don't think so. Becauseexit()
(orabort()
, for that matter) has well-defined no-return semantics, and so does areturn
inmain()
.main()
is a special function in the language and should be treated as such.SEt-t commentedon Apr 4, 2025
What you think the leak is is completely irrelevant to the language. Assigning the result of
fopen()
somewhere or not doesn't change that the handle isn't directly passed to releasing function. So in terms of resource leaks those cases are equivalent.Let's look at some cases:
f()
callsfopen()
and returns that handle.main()
callsf()
and doesn't assign the result anywhere.main()
callsf()
and assigns the result to variable that is never used after that.main()
callsf()
and assigns the result to variable that is used to work with file but never explicitlyfclose()
'd.f()
is called from functiong()
instead ofmain()
,main()
consists of single linereturn g();
.Which cases have resource leak by your "definition"?
alavrentiev commentedon Apr 6, 2025
@SEt-t :
In my definition (no quotes), a resource leak is an irrecoverable use of some limited supply, which can lead to irrecoverable exhaustion of such.
In your examples, the source code analyzer is perfectly capable to tracking that 2 is a leak, because
f()
returns a handle, which is lost, i.e. its resource is no longer recoverable (e.g. suppose thatf()
is called again beforemain()
terminates -- that's the leaking accumulation). 3 and 4 are not leaks because the handles inmain()
keep referencing that veryFILE*
, andmain()
takes care of it by either doing an explicitfclose()
or closing it implicitly when it terminates, in either case leading to a controlled recovery of the resource.In 5, 6, and 7, I assume
g()
returns anint
(to be compatible with whatreturn
frommain()
is expected to have), so they are all leaks because the handle had not been previously closed, and is now lost. Should suchg()
be called somewhere in a loop, for example, all those handles thatf()
produced could obviously exhaust the availableFILE
pointers or the underlying file descriptors, and the recovery is impossible.If, however,
g()
returned the result off()
to be used inmain()
asreturn g() ? 0/*success*/ : 1/*failure*/;
, then there's no leak, as again the handle is still trackable [albeit implicitly in this case], andmain()
is going to close it (upon executing thatreturn
statement). Now, if this very samereturn
statement was used in any other function, however, then it would constitute a resource leak again [because at the return point in such a function, the handle would become actually lost]. Which is why I keep saying thatmain()
is a special function, whosereturn
should be regarded as equivalent toexit()
.SEt-t commentedon Apr 7, 2025
I think I see where you get it wrong:
C handles are just opaque values with no special behavior (like cleanup). Absolutely nowhere in standard it's said that
main()
implicitly closes any handles. Only that return frommain()
implicitly callsexit()
with argument of the returned value (not any magic handles!):And then
exit()
does the cleanup:So, 2 and 3 are indistinguishable in terms of resource cleanup. (As well as 4-7.)
steakhal commentedon Apr 7, 2025
I wonder if the same argument would hold if
main
is a function with thousands of lines of code. Should we ignore leaks there too?Or what if
main
recurses and calls itself. I believe thats valid in C. Then should we consider leaks?SEt-t commentedon Apr 7, 2025
@steakhal
I strongly feel that auto-closing of FILE handles should not be relied upon and warning should be issued. What if FILE is opened in a loop of runtime-read N iterations? Small N can work just fine but not large ones. And yes, recursive
main()
is valid in C. So it's better to be consistent and not special-casemain()
.vakatov commentedon Apr 7, 2025
Probably there should be some kind of formal "golden rule" about the Static Analyzer’s expectations vis-à-vis resource management by the user code, something like:
All resources (heap memory, file descriptors, etc.) which are allocated as a result of the user code action and whose handles are in any way (directly or indirectly) returned to be owned by the user… must be deallocated (or scheduled/registered to be deallocated) by the user code.
steakhal commentedon Apr 9, 2025
That's pretty much the sentiment, yes.