Closed
Bug 663423
Opened 14 years ago
Closed 14 years ago
Memory usage reports via telemetry
Categories
(Core :: General, defect)
Core
General
Tracking
()
RESOLVED
FIXED
mozilla7
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: [MemShrink:P1][inbound])
Attachments
(4 files, 2 obsolete files)
4.75 KB,
patch
|
taras.mozilla
:
review+
|
Details | Diff | Splinter Review |
1.67 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
730 bytes,
patch
|
justin.lebar+bug
:
review+
|
Details | Diff | Splinter Review |
2.43 KB,
patch
|
taras.mozilla
:
review+
|
Details | Diff | Splinter Review |
Now that telemetry is working, it'd be great to get some kind of data gathering and analysis relating to memory usage.
Comment 1•14 years ago
|
||
We have *some* kind of reporting already. http://people.mozilla.com/~tglek/telemetry/log.html lists aggregate info on some of the about:memory stuff
![]() |
Assignee | |
Comment 2•14 years ago
|
||
Great! So this bug is a matter of working out what more we can do with that data, if we can automate any kind of tracking, etc.
Comment 3•14 years ago
|
||
(In reply to comment #2)
> Great! So this bug is a matter of working out what more we can do with that
> data, if we can automate any kind of tracking, etc.
Well, I put those probes in as an example. I would be happy if you could suggest refinements to that or a better approach altogether.
![]() |
Assignee | |
Updated•14 years ago
|
Blocks: MemShrinkTools
Comment 4•14 years ago
|
||
Here's the list of currently-tracked memory reporters, from TelemtryPing.js:
"explicit/js/gc-heap": [1024, 512 * 1024, 10],
"resident": [32 * 1024, 1024 * 1024, 10],
"explicit/layout/all": [1024, 64 * 1025, 10],
![]() |
Assignee | |
Comment 5•14 years ago
|
||
We should do "explicit" as well, that's arguably the single most interesting number. That'll require bug 660731.
Assignee: nobody → nnethercote
Depends on: 660731
![]() |
Assignee | |
Comment 6•14 years ago
|
||
http://blog.mozilla.com/tglek/2011/06/22/developers-how-to-submit-telemetry-data/ is useful info for this bug.
![]() |
Assignee | |
Comment 7•14 years ago
|
||
explicit/js/gc-heap-chunk-unused (from bug 661474) would also be interesting, for an indication of fragmentation on the JS heap.
Comment 8•14 years ago
|
||
(In reply to comment #7)
> explicit/js/gc-heap-chunk-unused (from bug 661474) would also be
> interesting, for an indication of fragmentation on the JS heap.
is there a reason you didn't include a telemetry probe in that bug?
![]() |
Assignee | |
Comment 9•14 years ago
|
||
(In reply to comment #8)
>
> is there a reason you didn't include a telemetry probe in that bug?
I am planning to do all the new telemetry additions in a single patch here :)
![]() |
Assignee | |
Comment 10•14 years ago
|
||
So the memory reporting in telemetry is completely busted. This first patch fixes it. Is this stuff tested anywhere?
Attachment #543339 -
Flags: review?(tglek)
![]() |
Assignee | |
Comment 11•14 years ago
|
||
This patch:
- Adds three extra memory measurements to telemetry.
- Removes |memReporters| from |gatherMemory|, because it is dead.
It relies on mgr.explicit which is added by the patch in bug 660731.
Attachment #543340 -
Flags: review?(tglek)
![]() |
Assignee | |
Comment 12•14 years ago
|
||
BTW, I didn't do explicit/js/gc-heap-chunk-unused because that's computed via a multi-reporter (bug 666075) and there's currently no way to extract a single number from a multi-reporter.
Updated•14 years ago
|
Attachment #543339 -
Flags: review?(tglek) → review+
Updated•14 years ago
|
Attachment #543340 -
Flags: review?(tglek) → review+
Comment 13•14 years ago
|
||
(In reply to comment #10)
> Created attachment 543339 [details] [diff] [review] [review]
> patch 1
>
> So the memory reporting in telemetry is completely busted. This first patch
> fixes it. Is this stuff tested anywhere?
Not yet. I'm not sure of the best way to test it. I was thinking of adding a NS_WARNING to the for loop if every memory reporter wasn't found (or if some of them are reporting 0, which seems crazy). Turns out there isn't an NS_WARNING equivalent from JS.
![]() |
Assignee | |
Comment 14•14 years ago
|
||
(In reply to comment #13)
> Turns out there isn't an NS_WARNING equivalent from JS.
JS_ASSERT! (Yes, I'm serious; if this is busted I want to know, not have a warning that gets lost in the 100s of other warnings :)
Comment 15•14 years ago
|
||
(In reply to comment #14)
> (In reply to comment #13)
> > Turns out there isn't an NS_WARNING equivalent from JS.
>
> JS_ASSERT! (Yes, I'm serious; if this is busted I want to know, not have a
> warning that gets lost in the 100s of other warnings :)
I meant from .js
Comment 16•14 years ago
|
||
Also while we do this, can we check if val != 0 before adding it to a histogram, I don't know why MEMORY_RESIDENT would be 0 so often.
Comment 17•14 years ago
|
||
The above patches use NS_ASSERT, which isn't available in this module(and is apparently dangerous too as it shows up as a modal dialog).
Turns out there is no good way to assert(that I can see) from js.
Attachment #543528 -
Flags: review?(nnethercote)
Comment 18•14 years ago
|
||
I don't want to take the risk of branching for aurora with broken memory reporters. Lets checkin the fix right away.
Attachment #543339 -
Attachment is obsolete: true
Attachment #543529 -
Flags: review?(justin.lebar+bug)
Comment 19•14 years ago
|
||
Comment on attachment 543529 [details] [diff] [review]
patch 1(for checkin)
>+ if (val)
>+ h.add(val);
Why are we excluding zero values? It's perfectly valid for the number of hard page faults since the last sample to be 0, for instance.
Comment 20•14 years ago
|
||
(In reply to comment #19)
> Why are we excluding zero values? It's perfectly valid for the number of
> hard page faults since the last sample to be 0, for instance.
Taras and I had a long argument about this on IRC.
He thinks that 0 values aren't interesting, because they're a non-event. In the unlikely scenario that you want to compute the proportion of time spent not swapping versus swapping, you can infer the number of 0s that are missing by looking at some other memory reporter. Also, since presumably the 0s bucket will be much larger than the other buckets, including it will make the histogram ugly.
I think the histogram makes no sense without the 0 values. You *always* want to compare the number of swap events to the number of non-swap events. Otherwise you're just learning a number which is strongly correlated with how long the browser has been running. If the histogram can't display this well, then it's bugged.
Anyway, Taras needs this hack in there because some other reporters aren't working properly -- RSS reports 0 about half the time, for reasons which are unclear to me. We agreed that the patch gets r=me if this check is called out as a hack and we commit to fix it in the next cycle.
Updated•14 years ago
|
Attachment #543529 -
Flags: review?(justin.lebar+bug) → review+
Comment 21•14 years ago
|
||
http://hg.mozilla.org/integration/mozilla-inbound/rev/e5de7146ac19
please don't close this bug on mc merge :)
Comment 22•14 years ago
|
||
> You *always* want to compare the number of swap events to the number of non-swap events.
For instance, I might want to know: "Is FF becoming less swappy for users?" I'd compute this by comparing the number of events below a threshold (say less than 1000 faults / min) to those above a threshold. The 0s are crucial here.
(This further shows why it doesn't make sense to exclude 0s: A value of 0 faults / min isn't a lot different from 2 faults / min. Yet only one makes it into the histogram?)
Comment 23•14 years ago
|
||
Also, WRT the point that you can infer the number of 0s by examining another memory reporter: That only works if the memory reporters aren't randomly returning 0. :)
![]() |
Assignee | |
Comment 24•14 years ago
|
||
Comment on attachment 543528 [details] [diff] [review]
enforce mem reporters
Review of attachment 543528 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #543528 -
Flags: review?(nnethercote) → review+
Comment 25•14 years ago
|
||
Target Milestone: --- → flash10
Updated•14 years ago
|
Target Milestone: flash10 → ---
Comment 26•14 years ago
|
||
I tested this and didn't see 0s coming from anywhere except hard-page-faults, where it's expected.
If RSS is still reporting 0 for some unknown reason, I don't think we should hide it, as we currently do. Since it's possible that the 0s are reported in place of certain values (as we don't know where they were coming from), taking out the 0s entirely kind of invalidates the RSS measure.
If you really feel strongly that 0 hard page faults in an interval shouldn't be reported, then let's exclude 0s for UNITS_COUNT, but not for other units. Although I still feel like that's entirely the wrong thing to do. Maybe njn can referee...
Updated•14 years ago
|
Attachment #543613 -
Attachment description: UNITS_COUNTS fix. → UNITS_COUNT fix.
Comment 27•14 years ago
|
||
Comment on attachment 543613 [details] [diff] [review]
UNITS_COUNT fix.
Taras, do you think we can get this in before we branch?
Attachment #543613 -
Flags: review?(tglek)
Comment 28•14 years ago
|
||
Comment on attachment 543613 [details] [diff] [review]
UNITS_COUNT fix.
>- // hack to deal with some memory reporters returning 0
>- if (val)
>- h.add(val);
>+ h.add(val);
This is still needed for UNITS_BYTES.
Attachment #543613 -
Flags: review?(tglek) → review-
Comment 29•14 years ago
|
||
Comment on attachment 543613 [details] [diff] [review]
UNITS_COUNT fix.
Maybe njn can arbitrate; would you be OK with that, Taras?
Attachment #543613 -
Flags: feedback?(nnethercote)
![]() |
Assignee | |
Comment 30•14 years ago
|
||
I'd like to include zero counts for byte measurements. We should never get zero for them, AFAIK -- memory reporters return -1 if they fail -- so if we do that is interesting, and probably a sign something is buggy.
Comment 31•14 years ago
|
||
I think we all agree with including the zeros for byte measurements. What about for counts?
![]() |
Assignee | |
Comment 32•14 years ago
|
||
Comment on attachment 543613 [details] [diff] [review]
UNITS_COUNT fix.
Review of attachment 543613 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #543613 -
Flags: feedback?(nnethercote) → feedback+
Comment 33•14 years ago
|
||
(In reply to comment #30)
> I'd like to include zero counts for byte measurements. We should never get
> zero for them, AFAIK -- memory reporters return -1 if they fail -- so if we
> do that is interesting, and probably a sign something is buggy.
Ok, so r+ with a fix that checks for -1 and does a continue.
Math.round(-1/1024)==0 :)
![]() |
Assignee | |
Comment 34•14 years ago
|
||
(In reply to comment #31)
> I think we all agree with including the zeros for byte measurements. What
> about for counts?
I don't think we should throw away any valid measurements, ie. measurements where nothing went wrong when taking the measurement.
Do we need to check for -1 for page fault counts?
Comment 35•14 years ago
|
||
> Do we need to check for -1 for page fault counts?
Yes, they return -1 on failure.
Comment 36•14 years ago
|
||
Attachment #543627 -
Flags: review?(tglek)
Updated•14 years ago
|
Attachment #543613 -
Attachment is obsolete: true
Comment 37•14 years ago
|
||
Comment on attachment 543627 [details] [diff] [review]
Fix for UNITS_COUNT v2
Thanks, happy to see a proper fix for this.
Please triplecheck that this works as expected on your machine.
Attachment #543627 -
Flags: review?(tglek) → review+
Comment 38•14 years ago
|
||
Checked one last time that it works on my machine, and pushed to inbound.
http://hg.mozilla.org/integration/mozilla-inbound/rev/016fcb800fc2
Updated•14 years ago
|
Whiteboard: [MemShrink:P1] → [MemShrink:P1][inbound]
![]() |
Assignee | |
Comment 39•14 years ago
|
||
http://hg.mozilla.org/integration/mozilla-inbound/rev/1ec0339ac9a1
That is "patch 2" with some minor rebasing.
This should be the last patch for this bug, so when it's merged to m-c this bug can be marked as RESOLVED FIXED. Thanks.
Comment 40•14 years ago
|
||
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla7
Comment 41•14 years ago
|
||
ehr, still missing part 2 from this merge.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 42•14 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•