Bad Refactoring Suggestion on int.ToString()?
In cases where my code looks like this:
int i = 12345;
string msg = string.Format("i = ", i.ToString())
ReSharper suggests to "Remove redundant 'ToString'" and identifies "i.ToString" as a "Redundant "Object.ToString()" call".
If I am not mistaken, if I don't call ToString explicitly on the i variable, i will be boxed, and then ToString() will be called on the boxed integer. This is a performance issue, as the value type int/Int32 has a ToString that is optimized and doesn't require boxing.
In other words, if we have this:
object void Method(string arg1)
then this:
object o = Method(i) // involves a boxing operation on i
object o = Method(i.ToString()) // no boxing
If I am correct (and I believe I am) ReSharper's suggestion that the ToString() is redundant will introduce an unnecessary performance penalty if the suggestion is made for value types that already define their own ToString method, like Int32s do.
-Troy
Please sign in to leave a comment.
"Troy Hall" <no_reply@jetbrains.com> wrote in message
news:31609493.1173393756686.JavaMail.itn@is.intellij.net...
>
>
>
Do you actually have an application where there is a measurable performance
difference between these two?
John
Hello Troy,
In version 3.0 we will provide means to disable warnings on specific places
in code. In the current released version you can disable warning either through
the "Disable..." action in the lightbulb menu or in the options dialog.
As for the example, if your application is time critical you might probably
consider discarding string.Format method and building strings via string.Concat
or StringBuilder class. Another point is that if you use ToString method
on an integer you loose opportunity to provide custom presentation format
of the integer.
Best regards,
Andrey Simanovsky
Andrey Simanovsky (JetBrains) wrote:
Actually, it could be a good idea for ReSharper to suggest replacing
string.Format( ..., i) with the non-boxing string.Format( ...,
i.ToString()), provided that the integer is not used with special
formatting (and if it is, then maybe suggest using
i.ToString(theSpecialFormat) instead).
Sergey
+1
I'm absolutely sure that performance degradation due to boxing in string
format methods couldn't be seen in the most powerful microscope....
--
Eugene Pasynkov
Developer
JetBrains, Inc
http://www.jetbrains.com
"Develop with pleasure!"
"Troy Hall" <no_reply@jetbrains.com> wrote in message
news:31609493.1173393756686.JavaMail.itn@is.intellij.net...
>
>
>
>
>
>
Thanks for all your responses. I am a little taken aback by the number of people focusing on my use of String.Format() for the example rather than whether or not the refactoring suggests a boxing operation. But, that one example is not representative of all the places that I and that of millions of other developers have to convert an integer to a string. I am disappointed at the lack of imagination in the responses critical to my asking a question.
A boxing operation takes processor cycles unnecessarily, and also has the potential to stress the garbage collector unnecessarily. If a tool suggests a boxing operation where one is not needed, the tool is doing a disservice.
So,
A. Given method(string s), does method(i) introduce boxing, where method(i.ToString()) does not?
B. And doesn't boxing use more processing?
C. Shouldn't a tool not suggest refactorings that cause unnecessary performance degradation?
D. Therefore, is not the refactoring suggested in the original post ill-advised?
Correct me if I am wrong in any of the points above. Or, correct ReSharper. One of the two should happen.
"Troy Hall" <no_reply@jetbrains.com> wrote in message
news:12436060.1173456753989.JavaMail.itn@is.intellij.net...
>
>
>
You're wrong.
You are spending more time thinking about this than the amount of
performance that will be lost by all the boxing that will ever be done in
the entire career of most developers!
If you have some application where you can measure the performance
difference caused by boxing, then please tell us about this application. The
next question would then be, "how many developers will be developing similar
applications". If the answer is "one in a million", then JetBrains would be
well served to simply ignore the issue (politely). If the answer is "one in
a thousand", then maybe that's worth a bug report and the addition of a
setting in Options.
John
Hello Troy,
We appreciate your suggestion, but it does lack a compelling use-case where
keeping i.ToString() is important.
Answering your question A - yes, there is boxing operation.
Answering your question B - boxing itself does, the .ToString() call - not
necessarily:
Test case:
using System;
class C
{
public static void Main(string[] argv)
{
int x = int.Parse(argv[0]);
const int MAX = 2000000;
const int FIVE = 5;
for (int j = 0; j < FIVE; j++)
{
int count1 = Environment.TickCount;
for (int i = 0; i < MAX; i++)
string.Format("x = {0}", x);
count1 = Environment.TickCount - count1;
int count2 = Environment.TickCount;
for (int i = 0; i < MAX; i++)
string.Format("x = ", x.ToString()); count2 = Environment.TickCount - count2; int count3 = Environment.TickCount; for (int i = 0; i < MAX; i++) string.Concat("x = ", x.ToString()); count3 = Environment.TickCount - count3; Console.WriteLine("No ToString() - , with ToString() - , Concat
instead of Format - ", count1, count2, count3);
}
}
}
Ildasm shows boxing is in its place as all other code.
Typical output:
No ToString() - 1297, with ToString() - 1328, Concat instead of Format - 688
No ToString() - 1312, with ToString() - 1219, Concat instead of Format - 625
No ToString() - 1219, with ToString() - 1234, Concat instead of Format - 625
No ToString() - 1219, with ToString() - 1219, Concat instead of Format - 625
No ToString() - 1218, with ToString() - 1204, Concat instead of Format - 625
As you can see (even ignoring extreme cases) one cannot tell whether .ToString()
affects the actual performance or not.
I have no answer to question C. Probably, you are right.
The answer to question D is "no". It is a good advice to keep your code concise.
Best regards,
Andrey Simanovsky
Andrey,
As I wrote in my last e-mail, let's not focus on my poor example of using String.Format(). I refactored your example and came up with very compelling results - a reduction in execution time of 11-17% when using ToString().
Here's my code:
using System;
class C
{
public static void Main(string[] argv)
{
int x = int.Parse(argv[0]);
const int MAX = 2000000;
const int FIVE = 5;
for (int j = 0; j < FIVE; j++)
{
int count0 = Environment.TickCount;
for (int i = 0; i < MAX; i++)
string.Concat(string.Empty, x);
count0 = Environment.TickCount - count0;
int count1 = Environment.TickCount;
for (int i = 0; i < MAX; i++)
string.Concat(string.Empty, x.ToString());
count1 = Environment.TickCount - count1;
Console.WriteLine("Concat No ToString() - {0}, Concat ToString() - ", count0, count1);
}
}
}
Here's the results of several runs:
C:\...\ToStringBoxingTest\bin\Release>ToStringBoxingTest.exe 0
Concat No ToString() - 734, Concat ToString() - 610
Concat No ToString() - 734, Concat ToString() - 610
Concat No ToString() - 734, Concat ToString() - 609
Concat No ToString() - 735, Concat ToString() - 625
Concat No ToString() - 734, Concat ToString() - 609
C:\...\ToStringBoxingTest\bin\Release>ToStringBoxingTest.exe 32000
Concat No ToString() - 906, Concat ToString() - 782
Concat No ToString() - 922, Concat ToString() - 812
Concat No ToString() - 922, Concat ToString() - 797
Concat No ToString() - 922, Concat ToString() - 797
Concat No ToString() - 921, Concat ToString() - 797
C:\...\ToStringBoxingTest\bin\Release>ToStringBoxingTest.exe 2000000000
Concat No ToString() - 1219, Concat ToString() - 1062
Concat No ToString() - 1156, Concat ToString() - 1047
Concat No ToString() - 1172, Concat ToString() - 1031
Concat No ToString() - 1172, Concat ToString() - 1031
Concat No ToString() - 1188, Concat ToString() - 1062
Getting back to another comment on this thread, one would not use String.Format() in a code where performance was an important quality attribute. Using +/Concat() would be a more typical situation. The results speak for themselves, and this doesn't even take into account how boxing affects the GC, or potential heap implications.
I agree that in some situations that using ToString() doesn't buy you much. However, since either version of the code will compile and run and have the same results, a developer might not be aware that throught a simple change in habit they could eke out even better performance from their application. Using ToString() on value types where Strings are expected is probably a good idea, like using a turn signal in an automobile. It might not be necessary all the time, but it is a very good habit. If ReSharper encouraged this habit, or at least supported the conscious decision to use or not use ToString(), developers and code could benefit.
Sincerely,
Troy Hall
Hello Troy,
Yes, with string.Concat in place of string.Format there is a noticable difference
with .ToString and without. But ReSharper does NOT suggest dropping .ToString
in string.Concat. It does only in string.Format. Thus, so far ReSharper does
the right thing. Use string.Format (which more or less fairly means you do
not consider performance an issue at this very place) and it suggests dropping
.ToString. Use string.Concat and it does not.
The questionable item is operator "+". It is the same as string.Concat, but
it is generally used without performance considerations. ReSharper suggests
dropping .ToString() when "+" is used.
Best regards,
Andrey Simanovsky
Andrey,
Thanks for your continued patience. Adding ToString() to an integer that is an argument to String.Format may not buy performance, but it is certainly not "redundant". If you keep that refactoring, I suggest changing the message.
I did not realize that ReSharper does not suggest the ToString() refactoring for String.Concat(). Every knowledgeable C# developer knows that the compiler will generate String.Contact() in the IL when it runs across "+" on strings, and therefore most good C# coders don't use String.Concat(). (Well, some uber-geeky types that like writing obtuse code might use it, but they probably don't play nice with the other coders any way.)
Because the "" operator means one thing with strings and another with integers and other numeric types, experienced developers take the safe route and convert numeric types to strings when they are an operand in an expression that includes "" meant for strings. This practice guards against many problems with expressions that mix string literals and variable values, for example, changing the datatype of a variable, rearranging operands, and changing expression grouping by adding or removing parentheses.
Similarly, many seasoned developers always write boolean equality expressions with the constant on the left of the equality sign, e.g. if (5 == iterations) { // }. This practice guards against the developer introducing a mistake in leaving off one of the equals signs, turning the expression into an assignment that still runs without error. It isn't necessary to do this. It's just a good idea.
So, in the end, having ReSharper suggest the ToString is "redundant" for String.Concat but not "" is simply silly. Having ReSharper suggest ToString is redundant in concatenation expressions using "" is dangerous. At best, suggesting the removal of ToString() on integers might be "potentially easier to read" but is never, ever "redundant", as (string)i does NOT do the same thing as i.ToString().
As many readers have pointed out, the performance advantages of using ToString may be vanishingly small in most practical usage scenarios. However, that does not make it redundant, and it does not justify ReSharper in effect telling developers that their use of it is wrong. If I write "i.ToString()", I MEAN "i.ToString()"!!! I deliberately and intentionally write "i.ToString()" because I know what it means and what I am trying to do. And, who in their right mind would type a call to ToString() unless they had a specific reason for doing so??? It is arrogant to call it a mistake, and insulting to suggest that an experienced developer doesn't know what they are doing.
-Troy
This conversation got me curious because I believed the ToString() option was correct (faster and used less memory).
The previous example code didn't try at all to determine costs of garbage collection or memory usage.
To mitigate my concerns, I applied liberal sprinklings of GC.Collect() and GC.GetTotalMemory(bool). Plus I used QueryPerformanceCounter() for better resolution for the timing loops (hoping to get rid of other system process interference by making the inner loop short enough). To top it all off, I threw out results more than one standard deviation from the average (I figured other system processes or garbage collection interfered with those results). I also tried for "short" and "long" value types to see the difference.
I found that the "ToString()" option is around 2-8% faster on my machine but, surprisingly uses about one more byte per "string.Format()" call than the non-ToString() option. Strangely, reducing the number of trials increases the memory usage/Format(). Using a long instead of an int causes the non-auto-boxing version to uses about 1/2 byte less memory (shorts and ints take the same amount of memory).
Code attached.
Attachment(s):
AutoBoxingTest.cs
Troy, as an aside, I've used that practice for a long time (a carry over from C++ days), but someone recently challenged me on it and I have yet to come up with an example where the C# compiler would mistakenly accept an assignment in place of comparison. Do you have one?
-Michael