Issue
I am developing a GUI application (PyQt) where the user adds data that I want to store in a local database (SQLite). I am a beginner in this field, and even though the application is not connected to a network and the stored data is not sensitive, I was curious to know how to block "SQL queries" that could come from the user.
For example, the GUI allows the user to add a new species name. I implemented a controller that gets the text of the input and passes it to re.sub to remove any special characters; then I prepare a query, pass the clean data and execute. If the user input is something like "SELECT * FROM SPECIES", it works in the sense that the user can't access all the data stored in the SPECIES table, but I get "SELECTFROMSPECIES" as new data in my database, which is not a good species name. I'm thinking of implementing conditions in my clean_data function like "if 'SELECT' in entry: do something". Another problem is that if the user entry is a good species name containing a space, such as "Mus musculus", my clean_data function deletes the space and stores "Musmusculus" in the database.
I'd like to know if there is a cleaner way to avoid these behaviours and properly handle the input, so that new user input is allowed while the database stays clean. Thank you in advance for your suggestions.
Here are my controller functions to clean the data and store it in the database:
import re

def clean_data(self, data):
    # Strip every character that is not a letter or digit (spaces included)
    clean_data = re.sub(r'[\W_]+', '', data)
    return clean_data

def get_data(self, data):
    clean_data = self.clean_data(data)
    self.query.prepare("INSERT INTO SPECIES (specie_name) "
                       "VALUES (:s_name)")
    self.query.bindValue(":s_name", str(clean_data))
    self.query.exec()
Solution
I am not familiar with PyQt’s database handling, but the whole point of having separate prepare and bindValue calls is that the library will sanitise the values for you. Any reputable database library will do so.
It proved surprisingly difficult to find PyQt documentation, but I did find a page on Real Python that says:
In PyQt, combining .prepare(), .bindValue(), and .addBindValue() fully protects you against SQL injection attacks …
This confirms what I said above.
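To illustrate the idea without depending on PyQt, here is a minimal sketch using Python's standard sqlite3 module instead of QSqlQuery. This is a swapped-in library, not the code from the question: the function name insert_species and the in-memory database are assumptions made for the example. The table, column and placeholder names are taken from the question. The input is passed through a named placeholder without any cleaning, and it ends up stored as plain text.

```python
import sqlite3

def insert_species(conn, name):
    # The value travels separately from the SQL text via the :s_name placeholder,
    # so it is treated as data and never parsed as SQL.
    conn.execute(
        "INSERT INTO SPECIES (specie_name) VALUES (:s_name)",
        {"s_name": name},
    )
    conn.commit()

# In-memory database, used here only for demonstration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SPECIES (specie_name TEXT)")

# No re.sub cleaning: spaces and punctuation are kept exactly as typed
insert_species(conn, "Mus musculus")
insert_species(conn, "SELECT * FROM SPECIES")
insert_species(conn, "x'); DROP TABLE SPECIES; --")

rows = [row[0] for row in conn.execute(
    "SELECT specie_name FROM SPECIES ORDER BY rowid")]
print(rows)
```

All three strings come back unchanged and the SPECIES table still exists, which suggests that the re.sub step is not needed for protection against injection. Whether "SELECT * FROM SPECIES" should be accepted as a species name is a validation question, not a security one.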
ekhumoro, in a comment on the question, linked to documentation on placeholders.
On a related note, if you try to sanitise data using tricks like removing non-letter characters, you can expect trouble. Either you will miss something dangerous, or you will annoy users by removing perfectly safe text. The second problem might lead to more complex sanitising attempts, which will make the first problem more likely.
Here is more information, mostly in response to a comment:
- Role of placeholders. With placeholders, the user’s data still becomes part of the query that is executed. But it is never pasted into the SQL statement as raw text: it is either passed to the database separately as a bound value or appropriately quoted first, which prevents SQL injection attacks.
- Placeholder names. Placeholder names are arbitrary (subject to the limits of SQL syntax). It may or may not be appropriate for them to correspond to field names. They do not have to, but sometimes it makes it easier. In fact, in the linked examples, the placeholder names do correspond to field names.
- Placeholder names as keys. Placeholder names can indeed be considered keys, and do not need to be defined (indeed, cannot be defined) before preparing the query.
- Validation. Validation is a completely separate issue from SQL injection. The latter can be taken care of by a library, but the former is up to you, because only you know what is valid and what is not. (Once you have a clear idea of what is valid, you can write code to enforce this, and feel free to ask another question if you need help with that.)
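As a rough sketch of what such validation might look like, the function below accepts a name only if it consists of words made of ASCII letters, separated by single spaces or hyphens. This rule is purely an assumption for illustration, and the names validate_species_name and _NAME_RE are hypothetical; the real rule depends on what counts as a valid species name in your application.

```python
import re

# Hypothetical rule: one or more words of ASCII letters,
# separated by a single space or hyphen.
_NAME_RE = re.compile(r"[A-Za-z]+(?:[ -][A-Za-z]+)*")

def validate_species_name(raw):
    """Return a normalised name, or raise ValueError if it looks invalid."""
    # Trim the ends and collapse runs of whitespace into single spaces
    name = " ".join(raw.split())
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"not a valid species name: {raw!r}")
    return name

print(validate_species_name("  Mus   musculus "))

try:
    validate_species_name("SELECT * FROM SPECIES")
except ValueError as exc:
    print("rejected:", exc)
```

Unlike stripping characters with re.sub, this approach rejects bad input instead of silently altering it, so the GUI can show a message and let the user correct the entry. Note that a string such as "SELECT FROM SPECIES" would still pass this particular rule; that is harmless when the value is stored through a placeholder, and a stricter rule is only needed if such names are unwanted.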
Answered By - Brian Drake