Hello,
This post is not a technical problem with igHtmlEditor but rather a request for best practice suggestions.
My MVC project uses igHtmlEditor to allow the user to enter content, which is then stored in a database. Later, this content will be retrieved for further editing.
To protect against scripting attacks, I am using Microsoft AntiXss. Currently I store the user content in the database as-entered and sanitize the content prior to sending it back to the browser.
Here is some html that was created in the igHtmlEditor and then saved to the database. This is exactly as expected.
<span style="font-family: Arial; font-weight: bold;">This is some bold text using the arial font.</span><div><span style="font-family: 'Comic Sans MS';">This is regular text using Comic Sans.</span></div><div><span style="font-family: 'Comic Sans MS';"><br></span><div><br></div></div>
After retrieving the content and sanitizing it, here is the sanitized html:
<span>This is some bold text using the arial font.</span><span>This is regular text using Comic Sans.</span><span></span>
Obviously this is not what is needed.
So I am wondering if anyone has suggestions for how to successfully send html back to the igHtmlEditor while at the same time protecting against a scripting attack.
Thank You!
Randy
Hey Randy,
You can use the following API to encode the string from the HTML editor:
string output = AntiXss.HtmlEncode(<the string>);
where "the string" is :
<span style=\"font-family: Arial; font-weight: bold;\">This is some bold text using the arial font.</span><div><span style=\"font-family: 'Comic Sans MS';\">This is regular text using Comic Sans.</span></div><div><span style=\"font-family: 'Comic Sans MS';\"><br></span><div><br></div></div>
This will result in the following encoded output:
"&lt;span style=&quot;font-family: Arial; font-weight: bold;&quot;&gt;This is some bold text using the arial font.&lt;/span&gt;&lt;div&gt;&lt;span style=&quot;font-family: &#39;Comic Sans MS&#39;;&quot;&gt;This is regular text using Comic Sans.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: &#39;Comic Sans MS&#39;;&quot;&gt;&lt;br&gt;&lt;/span&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;/div&gt;";
You can then store this value in the db, as you've described. To load it back into the editor, you can use the following code (note that I am running this code on the click of a button, but it can be placed anywhere in your code after the HTML editor has been initialized):
$("#load").bind("click", function () {
    // The stored value is HTML-encoded, so it is safe to embed in a string literal.
    var htmlString = "&lt;span style=&quot;font-family: Arial; font-weight: bold;&quot;&gt;This is some bold text using the arial font.&lt;/span&gt;&lt;div&gt;&lt;span style=&quot;font-family: &#39;Comic Sans MS&#39;;&quot;&gt;This is regular text using Comic Sans.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: &#39;Comic Sans MS&#39;;&quot;&gt;&lt;br&gt;&lt;/span&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;/div&gt;";
    // Decode the entities through a detached element, then hand the markup to the editor.
    $("#testEditor").igHtmlEditor("setContent", $('<div/>').html(htmlString).text(), "html");
});
And here is the result:
Hope it helps. Let me know if there is anything else I can assist with. Thanks,
Angel
Thank You Angel! This does work.
Consider the situation where the user enters and saves some text such as:
<script>alert('hi!');</script>
I am going to attempt to attach a screen shot of the result, but I think you get the idea.
So this solution displays the saved html correctly but does not protect against an XSS attack. Please correct me if I am wrong about this.
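To make the concern concrete, here is a minimal sketch in plain JavaScript. The `htmlEncode` and `htmlDecode` helpers are hypothetical stand-ins for the server-side encoder and for the `$('<div/>').html(...).text()` decode step. They are not the AntiXss or jQuery implementations and handle only five common entities. The point is only that an encode followed by a decode returns the original markup, script tag included:

```javascript
// Hypothetical stand-in for the server-side HTML encoder (five entities only).
function htmlEncode(text) {
  return text
    .replace(/&/g, "&amp;") // encode & first so later entities are not double-encoded
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical stand-in for the client-side decode step.
function htmlDecode(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, "&"); // decode & last so "&amp;lt;" becomes "&lt;", not "<"
}

const userInput = "<script>alert('hi!');</script>";
const stored = htmlEncode(userInput); // what would be saved to the database
const restored = htmlDecode(stored);  // what the editor receives after decoding

console.log(stored);
console.log(restored === userInput);  // true: the script tag survives the round trip
```

Encoding therefore protects the value only while it stays encoded. Once it is decoded for the editor, some form of sanitizing is still needed.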
My application is in development currently but when we go to production I don't think we can approach the problem this way.
What are your thoughts?
Thanks again,
Hey Randy,
OK, let me know if you have more questions. I will be glad to help.
Thanks,
Thank You, Angel. I will look into your suggestions and let you know what I learn.
In the meantime, I thought you would want to know that the AntiXss.HtmlEncode function has been deprecated (I know because Visual Studio told me...) and the preferred syntax is now Encoder.HtmlEncode -- still using the Microsoft.Security.Application namespace. AntiXss.GetSafeHtmlFragment is still correct.
Thanks again and I will be in touch.
Thanks, I hadn't noticed that. You're right - I think the Sanitizer shouldn't be doing that at all. It looks like it isn't mature enough to be used even for testing purposes. You can refer to the following thread, where many users complain about it stripping valid markup:
http://wpl.codeplex.com/workitem/17246
Therefore I suggest using a different library, using an older version of the AntiXss library (which I wouldn't recommend), or stripping the script tags yourself.
You can take a look at these two links for a list of tags and scenarios that need to be considered when stripping content:
http://msdn.microsoft.com/en-us/library/ff649310.aspx#paght000004_step4
https://www.owasp.org/index.php/Cross-site_Scripting_%28XSS%29
You can also take a look at this post, the author has created a solution targeting the AntiXss issues:
http://eksith.wordpress.com/2012/02/13/antixss-4-2-breaks-everything/
If you need a more custom and involved solution, my suggestion is to use the HTML Agility Pack and create a white list, instead of relying on regular expressions:
http://htmlagilitypack.codeplex.com/
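As a rough illustration of the white-list idea, here is a sketch of one piece of it: filtering the contents of a `style` attribute so that formatting such as `font-family` survives while anything unexpected is dropped. It is written in JavaScript only for illustration. In this scenario the same logic would run on the server, for example while walking the nodes parsed by the HTML Agility Pack. The property list, the value pattern, and the name `filterStyle` are all assumptions, and this is not a complete sanitizer: tags and other attributes would need their own white lists.

```javascript
// Hypothetical allow-list of CSS properties the editor is expected to produce.
const ALLOWED_STYLE_PROPERTIES = new Set([
  "font-family", "font-weight", "font-style", "font-size",
  "color", "background-color", "text-align", "text-decoration",
]);

// Conservative value pattern (an assumption): letters, digits, whitespace,
// comma, period, '#', '%', hyphen and single quote. Parentheses, colons,
// slashes and backslashes are rejected, which rules out url(...) and
// expression(...), and also rgb(...) colors.
const SAFE_VALUE = /^[a-z0-9\s,.#%'-]+$/i;

// Keep only allow-listed declarations whose values match SAFE_VALUE.
function filterStyle(styleText) {
  return styleText
    .split(";")
    .map((declaration) => {
      const colon = declaration.indexOf(":");
      if (colon === -1) return null; // not a "property: value" pair
      const property = declaration.slice(0, colon).trim().toLowerCase();
      const value = declaration.slice(colon + 1).trim();
      if (!ALLOWED_STYLE_PROPERTIES.has(property)) return null;
      if (!SAFE_VALUE.test(value)) return null;
      return `${property}: ${value}`;
    })
    .filter((declaration) => declaration !== null)
    .join("; ");
}

console.log(filterStyle("font-family: Arial; font-weight: bold;"));
console.log(filterStyle("color: red; background-image: url(javascript:alert(1))"));
```

The first call keeps both declarations and the second keeps only `color: red`. Matching values against a narrow pattern, rather than searching for known-bad strings, follows the same white-list reasoning as the suggestion above.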
Hope it helps. Thanks,
Hi Angel,
Thank You for your post.
I did try using Sanitizer; the problem is that it strips the formatting from valid content. For example, consider our original html:
<div><span style="font-family: Arial; font-weight: bold;">This is Arial Bold</span></div><div><span style="font-family: 'Comic Sans MS';">This is Comic Sans Regular</span></div>
After running that string through GetSafeHtmlFragment the result is:
<span>This is Arial Bold</span><span>This is Comic Sans Regular</span>
Obviously that will not display as needed when that string is pulled from the DB and sent back to the igHtmlEditor.
In a production environment I think we have to assume that data could be malicious and sanitize it. So at this point I can make an html string that displays properly in igHtmlEditor, or an html string that is free of potential security problems, but not both.
Thanks again for your suggestion. Do you have any other ideas about this?
That's a great point! For this type of malicious scripting attack, you can use the Sanitizer class, which is in the Microsoft.Security.Application namespace (again part of AntiXss). It has a couple of useful methods like GetSafeHtml and GetSafeHtmlFragment, so doing something like this:
string safeOutput = Sanitizer.GetSafeHtmlFragment("<script>alert('hi!');</script><span>safe</span>");
will result in: <span>safe</span>.