Couple of questions

Apr 25, 2013 at 2:26 PM
Hi,

I have a huge project that needs to be localized and I tested move to batch resource. While there are a lot of well-matched strings that don't need to be included, I still have a lot of false positives.

Is it possible to always ignore the following statements:

variable.Name = "<string>"
variable.Font = new System.Drawing.Font("<string>", ...)


Is it also possible to add strings to an ignore list (specific strings that never have to be refactored)?


I also have a lot of MessageBox.Show("<string>" + variable + "<string>" + variable + "<string>"); statements. Would it be possible to refactor these to MessageBox.Show(string.Format(resourceEntry, variable, variable)); where resourceEntry has the format "<string> {0} <string> {1} <string>"?


There are also a lot of strings that are passed into property indexers, like the key of a dictionary or the field names in a DataRow. Is it possible to ignore all strings that are passed into property indexers?

I will see whether I can add those things myself by downloading the source code, but it would be nice to have them as standard functionality.

Thanks in advance
Coordinator
Apr 25, 2013 at 2:46 PM
Hi drake7707,

thank you for using Visual Localizer.
I will start with a piece of advice - have a look at the Visual Localizer settings page (Tools/Options/Visual Localizer). In the "Filter ToolWindows" node, you will see a set of what I call "localization criteria". These criteria are used to calculate the localization probability (the percentage value displayed in the tool window grid). You can customize these any way you want.

Now, to your questions. The trouble is, Visual Localizer does not actually parse the code; it merely looks for string literals based on the occurrences of ". It is therefore not able to detect situations like variable.Name = "<string>" - it only sees the string, not the code around it. In the settings page, you can say that strings from certain methods, classes or namespaces should be eliminated, but you can't make decisions based on the surrounding code. For the same reason, you can't simply eliminate values passed as indexers.
Is it also possible to add strings to an ignore list (specific strings that never have to be refactored)?
Yes, using the Visual Localizer settings page.
I also have a lot of MessageBox.Show("<string>" + variable + "<string>" + variable + "<string>"); statements.
This is not yet possible, however as I think about it, it would not be so difficult to code it. The only trouble is how to detect such situations. Since the deadline of this project approaches (it's my Bachelor Thesis), I am not sure whether I will implement it right away - I will give it a try and let you know how things are going.

Thank you again for your inquiry, I hope you enjoy Visual Localizer.
Apr 25, 2013 at 3:15 PM
Thanks for your quick reply.


If it were possible to add a filter on the line from the Context column (where RESOURCE REFERENCE is inserted), I think I would be able to filter out most of the issues I have. If I could add a criterion like

Criteria predicate: [Line] [matches]
Regular expression: ".+.Name ="
Action: force not localize

It would probably solve both of my questions above, but I'm not sure how feasible it would be to add this.
The only trouble is how to detect such situations.
Imo the best way would be to detect all the strings in a single statement, which would probably require a C# tokenizer and building an AST, which would be a lot of work if there isn't an existing implementation available. A simpler way would be, once the first string is encountered, to keep reading until you encounter the end of the statement ; (thinking in C# here), then split on the + outside the strings (and determine which parts are strings by checking if they are surrounded with quotes). This would probably work for simple examples like the one I gave above, but would give a lot of problems if parentheses are mixed in.
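The simpler approach described above can be sketched roughly as follows. This is an illustration in Python, not code from Visual Localizer or from any post in this thread. The function names split_concatenation and to_format are hypothetical, and the sketch assumes C#-style backslash escapes inside regular (non-verbatim) string literals.

```python
from typing import List, Tuple


def split_concatenation(expr: str) -> List[str]:
    """Split expr on '+' characters that lie outside double-quoted literals."""
    parts, current, in_string = [], [], False
    i = 0
    while i < len(expr):
        ch = expr[i]
        if in_string:
            current.append(ch)
            if ch == "\\" and i + 1 < len(expr):
                # Backslash escape: copy the escaped character and skip over it.
                current.append(expr[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch == "+":
            # A '+' outside a literal separates two operands.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current).strip())
    return parts


def to_format(expr: str) -> Tuple[str, List[str]]:
    """Turn a concatenation into a (format string, arguments) pair."""
    fmt, args = [], []
    for part in split_concatenation(expr):
        if len(part) >= 2 and part.startswith('"') and part.endswith('"'):
            # Literal: keep its body, doubling braces so they stay plain text.
            fmt.append(part[1:-1].replace("{", "{{").replace("}", "}}"))
        else:
            # Anything else becomes the next numbered placeholder.
            fmt.append("{%d}" % len(args))
            args.append(part)
    return "".join(fmt), args


fmt, args = to_format('"Value " + a + " is less than " + b + "."')
print(fmt)   # Value {0} is less than {1}.
print(args)  # ['a', 'b']
```

As noted above, this only handles flat concatenations: an operand such as (a + b) would be split at the inner +, and verbatim @"..." literals are not recognized.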
Coordinator
Apr 25, 2013 at 3:29 PM
If the criterion matching line helps, I will implement it right away.
Regarding the string concatenation - I have something like this already implemented, but for a completely different purpose. In VB .NET, there is no way of inserting a control character into a string literal. In C#, there are escape sequences (and Visual Localizer can parse them). VB developers therefore need to use this:

Dim s As String = "first line" + vbNewLine + "second line"

'vbNewLine' is a constant. Visual Localizer is able to concatenate the string literals and insert CR+LF between them. Whenever a string literal occurs, the space between this one and the previous one is checked and matched against several pre-defined regular expressions. If a match is found, the strings are merged.

The idea with the semicolon is simple and elegant; however, it works only in C# (VB statements are terminated by a newline) and I want to provide consistent support for all languages. Since I used regular expressions to solve the VB issue, maybe it would be enough to merge the strings if the space between them matched "something.somewhere.name". It would of course fail in a lot of cases...
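A rough sketch of the gap-matching idea for the VB case, written in Python for illustration only. The rule table GAP_RULES, the literal pattern VB_LITERAL and the function merge_literals are assumptions made for this example; Visual Localizer's actual regular expressions are not shown in this thread. The sketch also accepts & as a concatenation operator, which is an assumption on top of the description above.

```python
import re
from typing import List

# Hypothetical rules: pattern for the text between two literals -> text to insert.
GAP_RULES = [
    (re.compile(r"^\s*[+&]\s*vbNewLine\s*[+&]\s*$", re.IGNORECASE), "\r\n"),
    (re.compile(r"^\s*[+&]\s*vbCrLf\s*[+&]\s*$", re.IGNORECASE), "\r\n"),
    (re.compile(r"^\s*[+&]\s*vbTab\s*[+&]\s*$", re.IGNORECASE), "\t"),
    (re.compile(r"^\s*[+&]\s*$"), ""),  # plain concatenation of two literals
]

# A VB string literal: double quotes, with "" standing for one quote character.
VB_LITERAL = re.compile(r'"((?:[^"]|"")*)"')


def merge_literals(line: str) -> List[str]:
    """Return the string values on a line, merging literals whose gap matches a rule."""
    merged = []
    prev_end = None
    for m in VB_LITERAL.finditer(line):
        value = m.group(1).replace('""', '"')
        joined = False
        if prev_end is not None:
            # Text between the previous literal and this one.
            gap = line[prev_end:m.start()]
            for pattern, glue in GAP_RULES:
                if pattern.match(gap):
                    merged[-1] += glue + value
                    joined = True
                    break
        if not joined:
            merged.append(value)
        prev_end = m.end()
    return merged


print(merge_literals('Dim s As String = "first line" + vbNewLine + "second line"'))
# ['first line\r\nsecond line']
```

Literals separated by anything the rules do not recognize, such as the comma in Foo("a", "b"), stay separate.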
Coordinator
Apr 25, 2013 at 3:53 PM
I updated the release, the predicate "line" is available.
It matches the line where RESOURCE REFERENCE is inserted (including the RESOURCE REFERENCE text as well). Note that if your code contains something like this:
object.Name = "foo"; object2.TextForUser = "bar";
the "bar" text will NOT be localized, because the line would still match the ".+.Name =" regexp.
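The behaviour described above can be reproduced with a small sketch, written in Python for illustration. The function name is_excluded is hypothetical, and the sketch assumes the criterion fires when the pattern is found anywhere in the line, which is consistent with the example above but is not confirmed from the source code.

```python
import re

# Pattern from this thread. The dot before "Name" is unescaped,
# so it matches any character, not only a literal dot.
NAME_ASSIGNMENT = re.compile(r".+.Name =")


def is_excluded(context_line: str) -> bool:
    """Return True if the 'force not localize' action would apply to the line."""
    return NAME_ASSIGNMENT.search(context_line) is not None


print(is_excluded('button1.Name = "okButton";'))                         # True
print(is_excluded('object.Name = "foo"; object2.TextForUser = "bar";'))  # True
print(is_excluded('label1.Text = "Welcome";'))                           # False
print(is_excluded('userName = "admin";'))                                # True
```

The last line is excluded only because of the unescaped dot. Writing the pattern as ".+\.Name =" would limit it to member access such as button1.Name.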
Apr 26, 2013 at 7:17 AM
Excellent, thank you.
In VB .NET, there is no way of inserting a control character into a string literal.
The only escape I know of is doubling the quotes to insert one quote character, like: "this is a quote: ""."
Dim s As String = "first line" + vbNewLine + "second line"
Correct, but bear in mind that vbCrLf from the Microsoft.VisualBasic namespace is also used a lot (it was vbCrLf in VB6 too), with & for string concatenation rather than + (double + string in VB.NET will attempt to parse the string to a double, while double & string will convert the double to a string).

I created a simple proof of concept for converting strings to string.Format in C#: https://dl.dropboxusercontent.com/u/8797691/SimpleStringTransform.zip
While it works for the given example, there are still issues with it. For example, if you parse the output again it will try to nest the string.Format statements, it doesn't take @"<string>" into account (so strings spanning multiple lines will break), and I haven't tested it at all with lambda statements. The best way would probably be to use the tokenizer to be more content aware, but it will still take a lot of fiddling around.
the "bar" text will NOT be localized, because the line would still match the ".+.Name =" regexp.
That's ok, I don't think there are multiple statements on a single line in our codebase, and definitely not like that, but it's worth keeping in mind.
Coordinator
Apr 26, 2013 at 9:37 AM
Edited Apr 26, 2013 at 9:38 AM
Wow, that's a lot of well-documented code :-)

First of all, Visual Localizer is aware of vbVerticalTab, vbCrLf, vbCr, vbLf, vbNewLine, vbNullString, vbTab, vbBack, vbFormFeed, values of the ControlChars class and the Chr() and ChrW() functions. And the double-quote escape as well of course.

Most of the troubles you mentioned I have already dealt with and/or are eliminated by my approach (plus, you did not mention comments). What I need is something that would take the text between two string literals (arbitrarily distant) and say "YES, these can be merged using string.Format() - here are the arguments".
When solving the VB issues, I simply use a set of regular expressions. I think they would probably work here as well.

P.S. When using the "line" predicate, don't forget the line contains " RESOURCE REFERENCE ", including the double-asterisks (which must be escaped in the regexp). There are no spaces around the asterisks; unfortunately this forum interprets them as formatting characters :-).