Back in 2017 I had a hobby project called Car Cards.
As a kid I loved this card game: you distributed the pack of cards equally among the players (2–4), then each player stacked their cards and looked only at the top one. Each card showed an image of a car and some data about it (acceleration, maximum speed, consumption, etc.). You picked one stat, said it out loud, and the higher (speed, engine) or lower (acceleration, consumption) value won the round and collected the cards from the others. A few years ago I thought it would be great to give it a twist: play it with my own car against others' and package it as a mobile application. Finally I had the time to code it and release the Android version in the store.
To be honest the application was not a big hit, but I managed to get a few hundred downloads by posting its link in Android Facebook groups. And then came the trolls: the ones who did not fill in proper data for their car and uploaded crazy images (not even of cars).
For the data I implemented a check that filtered out unbelievable numbers (by defining a min and a max), but the silly images still remained (of gardens, kids, boats, whatever). Images were problematic because they could be disturbing for the opponent, but more importantly there was a PEEK CARD feature with which you can take a peek at the opponent car’s image and make your data choice based on that (e.g. if you have an Audi A8 3.0 and the opponent's car is a Peugeot 206, then you should not choose consumption, although there can be surprises).
I realized it would be great to detect what is on the image, but I did not have time to create my own neural network for that. Then I came across Clarifai.
It is a great online tool with an API which helps you detect the content of an image. You can give it a try with any image you like. Just upload it and you receive the list of predictions.
This gem saved me a lot of effort, since it is exactly the tool I was looking for: when a user creates or modifies a profile, I can easily check whether the uploaded image is of a car or not. Clarifai has a very straightforward sample project on GitHub for both Android and iOS, which makes implementation even faster. Right now I can only show you what I did on Android.
Basically what you have to do is:
- Copy the gradle dependency: “com.clarifai.clarifai-api2:core:2.2.+”
- Request an API key for Clarifai.
- Then store the API key as a string resource in strings.xml.
- Finally copy the necessary parts from App.java and RecognizeConceptsActivity.java (this contains the async call to Clarifai).
- If not added already, you will need the internet permission and most probably access to the camera and / or storage, declared in the Manifest and / or requested at runtime.
I also down-scaled the image to 600px on its longer side before posting it to Clarifai, so that it would not consume too much cell data. And there you go.
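The down-scaling step boils down to a small piece of arithmetic. Here is a minimal sketch of how the target size could be computed while keeping the aspect ratio; the class and method names (`ImageScaler`, `scaledSize`) are my own illustration and not part of the Clarifai sample.

```java
// Hypothetical helper (not from the Clarifai sample): computes the target
// dimensions so that the longer side is at most maxLongSide pixels.
final class ImageScaler {
    private ImageScaler() {}

    /** Returns {width, height}, scaled down proportionally if needed. */
    static int[] scaledSize(int width, int height, int maxLongSide) {
        if (width <= 0 || height <= 0 || maxLongSide <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        int longSide = Math.max(width, height);
        // Never upscale: small images are sent as they are.
        if (longSide <= maxLongSide) {
            return new int[] {width, height};
        }
        double ratio = (double) maxLongSide / longSide;
        int w = Math.max(1, (int) Math.round(width * ratio));
        int h = Math.max(1, (int) Math.round(height * ratio));
        return new int[] {w, h};
    }

    public static void main(String[] args) {
        int[] size = scaledSize(1200, 800, 600);
        System.out.println(size[0] + "x" + size[1]);
    }
}
```

On Android the resulting width and height could then be passed to `Bitmap.createScaledBitmap` before the image is uploaded.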
What you do with the response is up to you. The sample Android project just lists the predictions, but I checked whether it contains “name=car” anywhere. For me it was enough to kick off this new concept, but you can go into details and check whether a prediction is above a certain confidence percentage.
And now comes the best part: Clarifai is free up to 5000 calls / month, so it should be enough for most hobby projects or for trying out your own image recognition concept. If you need more just check out the pricing.
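If you want the stricter check with a confidence threshold, it could look roughly like the sketch below. Note that `Prediction` is a hypothetical stand-in for whatever name / confidence pairs you extract from the Clarifai response, and the 0.9 threshold is just an assumption, so treat this as an illustration and not as the client library's own API.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: Prediction is a hypothetical stand-in for the name/confidence
// pairs extracted from the Clarifai response, not a class of the client library.
final class CarCheck {
    private CarCheck() {}

    static final class Prediction {
        final String name;
        final float value; // confidence between 0 and 1

        Prediction(String name, float value) {
            this.name = name;
            this.value = value;
        }
    }

    /** True if a "car" prediction reaches the given minimum confidence. */
    static boolean looksLikeCar(List<Prediction> predictions, float minConfidence) {
        for (Prediction p : predictions) {
            if ("car".equalsIgnoreCase(p.name) && p.value >= minConfidence) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Prediction> predictions = Arrays.asList(
                new Prediction("vehicle", 0.99f),
                new Prediction("car", 0.97f));
        System.out.println(looksLikeCar(predictions, 0.9f));
    }
}
```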
Another useful tip: you should not bother the user with the error path for Clarifai (no response, empty response, timeout, etc.), because UX is king! Either check the image silently or, as I did, show a progress dialog (“Checking image…”) and an informative alert dialog only when you definitely see a problem with the image (in Car Cards: there was no sign of any car). In case of any error I just let it go and try not to irritate the user. Another important note is that you should not block any flow because of the image validation. By this I mean that you should only give a warning but never block the registration process just because of the response you get from the Clarifai API. It can be down, it can return a false prediction, etc.
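To make the "never block the user" idea concrete, here is a minimal sketch of such a fail-open check. The names (`FailOpenValidator`, `Verdict`) are made up for this post, and the `Callable` stands for whatever code actually calls Clarifai and inspects the predictions.

```java
import java.util.concurrent.Callable;

// Sketch of a fail-open image check (hypothetical names): any error or
// missing answer counts as "accept", only a clear "no car" gives a warning.
final class FailOpenValidator {
    private FailOpenValidator() {}

    enum Verdict { ACCEPT, WARN_NOT_A_CAR }

    /** containsCar wraps the remote check and may throw or return null. */
    static Verdict validate(Callable<Boolean> containsCar) {
        try {
            Boolean result = containsCar.call();
            if (result != null && !result) {
                return Verdict.WARN_NOT_A_CAR; // warn the user, but do not block
            }
            return Verdict.ACCEPT;
        } catch (Exception e) {
            // Timeout, no response, parsing problem: let the user continue.
            return Verdict.ACCEPT;
        }
    }

    public static void main(String[] args) {
        System.out.println(validate(() -> false));
        System.out.println(validate(() -> { throw new java.io.IOException("timeout"); }));
    }
}
```

Even the `WARN_NOT_A_CAR` result should only trigger the alert dialog, the registration itself still goes through.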
This was just a quick example, but if you think about it, there can be many more use cases:
- Real estate apps can check whether images really show houses / apartments.
- Car dealerships can do the same with cars.
- Apps that want profile images of people only.
- Filter profiles created by children (of course, this will not be 100% accurate)
- Filter images that contain nudity (tested :))
- … or any other app that needs images of a certain type: food, animal, boat, garden, whatever :)