Media Analysis

Users sometimes post screenshots and other images containing text in order to bypass the maximum message length restrictions of social media sites. By applying Optical Character Recognition (OCR), NTerminal converts those images into text before handing it over to other modules for sentiment analysis, named entity recognition, etc.

Keyword detection is also possible in user media, such as images or video previews. By leveraging neural network-based machine vision technologies, we automatically generate descriptions for that media and feed them through our standard NLP pipeline.
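As a rough illustration of how a keyword might be matched against media, the sketch below scans the `media` entries of an event record. The field names (`media`, `labels`, `text`) and the `match_pos` values ("media text", "media labels") are taken from the examples on this page. The function name `find_media_matches` and the case-insensitive substring matching are assumptions made for illustration, not a description of how NTerminal matches internally.

```python
def find_media_matches(event, keyword):
    """Return (match, match_pos) pairs for a keyword found in media text or labels.

    Hypothetical helper: the matching rule (case-insensitive substring)
    is an assumption, not NTerminal's documented behaviour.
    """
    needle = keyword.lower()
    matches = []
    for item in event.get("media", []):
        # Text recovered from the image by OCR
        if needle in item.get("text", "").lower():
            matches.append((keyword, "media text"))
        # Descriptive labels generated by machine vision
        for label in item.get("labels", []):
            if needle in label.lower():
                matches.append((label, "media labels"))
    return matches


event = {"media": [{"labels": ["llama", "alpaca", "grass"], "text": ""}]}
print(find_media_matches(event, "Alpaca"))
```

A real matcher would probably respect word boundaries rather than use plain substring search, which here would also report "alpaca" inside a longer label.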

Text Recognition Example

{
    "time": "2018-07-06T07:46:31Z",
    "event_source": "Twitter",
    "source_category": "",
    "source_subcategory": "",
    "event_type": "test",
    "label": "Alex Biryukov",
    "category": "person",
    "subcategory": "",
    "description": "",
    "match": "Alex Biryukov",
    "match_pos": "media text",
    "context": "Cryptanalysis, Reverse-Engineering and Design of Symmetric Cryptographic Algorithms Léo Perrin CSC & SnT, University of Luxembourg CryptoLUX Team; supervised by Alex Biryukov July 5th 2018 Luxembourg National Research Fund UNIVERSITE DU LUXEMBOURG Favitè des Sciences uni.I uni.Iu UNIVERSITÉ DU LUXEMBOURG",
    "title": "Some photos from Leo Perrin's Ph.D. thesis award ceremony which was yesterday. @FnrLux @uni_lu @SnT_uni_lu https://t.co/IhdIOGxYqw",
    "document_url": "https://twitter.com/alexcryptan/status/1015139919705116672",
    "stats": {
        "lines": 1,
        "pages": 1,
        "size": "134 bytes",
        "words": 20
    },
    "person_name_candidates": [],
    "organization_name_candidates": [],
    "location_name_candidates": [],
    "related_documents": [],
    "links": [],
    "hashtags": [],
    "user_mentions": [
        {
            "id": "3032332497",
            "link": "https://twitter.com/FnrLux",
            "name": "FNR Luxembourg",
            "screen_name": "FnrLux"
        },
        {
            "id": "88457086",
            "link": "https://twitter.com/uni_lu",
            "name": "uni.lu",
            "screen_name": "uni_lu"
        },
        {
            "id": "846698239680303106",
            "link": "https://twitter.com/SnT_uni_lu",
            "name": "SnT",
            "screen_name": "SnT_uni_lu"
        }
    ],
    "author": {
        "id": "2659189136",
        "link": "https://twitter.com/alexcryptan",
        "name": "Alex Biryukov",
        "screen_name": "alexcryptan"
    },
    "media": [
        {
            "labels": [
                "presentation",
                "public speaking",
                "seminar",
                "lecture",
                "academic conference",
                "communication",
                "orator",
                "meeting",
                "technology",
                "human behavior"
            ],
            "text": "Cryptanalysis, Reverse-Engineering and Design of Symmetric Cryptographic Algorithms Léo Perrin CSC & SnT, University of Luxembourg CryptoLUX Team; supervised by Alex Biryukov July 5th 2018 Luxembourg National Research Fund UNIVERSITE DU LUXEMBOURG Favitè des Sciences uni.I uni.Iu UNIVERSITÉ DU LUXEMBOURG",
            "type": "photo",
            "url": "http://pbs.twimg.com/media/DhZ_snSX0AASEk2.jpg",
            "person_name_candidates": [
                "Léo Perrin",
                "Alex Biryukov"
            ],
            "organization_name_candidates": [
                "University of Luxembourg",
                "CryptoLUX Team",
                "Luxembourg National Research Fund",
                "UNIVERSITE DU LUXEMBOURG"
            ],
            "location_name_candidates": [
                "Luxembourg",
                "UNIVERSITE DU LUXEMBOURG"
            ],
            "sentiment": {
                "google": {
                    "document": 0.20000000298023224,
                    "sentence": 0.20000000298023224,
                    "phrase": 0.0
                },
                "ibm": {
                    "document": 0.0,
                    "term": 0.0
                }
            }
        }
    ],
    "sentiment": {
        "google": {
            "document": 0.0,
            "sentence": -0.10000000149011612,
            "phrase": 0.0
        },
        "ibm": {
            "document": 0.0,
            "term": 0.0
        }
    }
}

Source: https://twitter.com/alexcryptan/status/1015139919705116672
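In a record like the one above, the post text yields no name candidates, while the text recognized in the photo does. A consumer might therefore want to merge the top-level candidate lists with those of each media item. The following is a minimal sketch under that assumption. `collect_candidates` is a hypothetical helper, not part of NTerminal, and the sample `event` is a trimmed copy of the record above.

```python
def collect_candidates(event, field):
    """Merge one candidate list from the post and from all of its media items.

    Order of first appearance is kept and duplicates are dropped.
    """
    merged = []
    # The post itself comes first, then each media item in order
    for source in [event] + event.get("media", []):
        for name in source.get(field, []):
            if name not in merged:
                merged.append(name)
    return merged


event = {
    "person_name_candidates": [],
    "media": [
        {"person_name_candidates": ["Léo Perrin", "Alex Biryukov"]},
    ],
}
print(collect_candidates(event, "person_name_candidates"))
```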

Video Description Generation Example

{
    "time": "2018-09-04T11:34:48Z",
    "event_source": "Twitter",
    "source_category": "",
    "event_type": "keyword",
    "keyword_label": "Alpaca",
    "keyword_category": "test",
    "keyword_subcategory": "",
    "keyword_description": "Mascot",
    "match": "alpaca",
    "match_pos": "media labels",
    "context": "",
    "title": "Baby Alpacas are so under appreciated. https://t.co/0lRb2XR3kn",
    "document_url": "https://twitter.com/AMAZlNGNATURE/status/1036940640427245569",
    "stats": {
        "lines": 1,
        "pages": 1,
        "size": "62 bytes",
        "words": 7
    },
    "person_name_candidates": [],
    "organization_name_candidates": [],
    "location_name_candidates": [],
    "related_documents": [],
    "links": [],
    "hashtags": [],
    "user_mentions": [],
    "author": {
        "id": "2828212668",
        "link": "https://twitter.com/AMAZlNGNATURE",
        "name": "Nature is Amazing 🐧",
        "screen_name": "AMAZlNGNATURE"
    },
    "media": [
        {
            "labels": [
                "llama",
                "camel like mammal",
                "alpaca",
                "terrestrial animal",
                "fauna",
                "vicuña",
                "livestock",
                "grass",
                "pasture",
                "lamb and mutton"
            ],
            "text": "",
            "type": "photo",
            "url": "http://pbs.twimg.com/ext_tw_video_thumb/994652754118299648/pu/img/561vdQeiKxFTZ1Fn.jpg"
        }
    ],
    "sentiment": {
        "google": {
            "document": 0.20000000298023224,
            "sentence": 0.4000000059604645,
            "phrase": 0.5
        },
        "ibm": {
            "document": 0.798356,
            "term": 0.761833
        }
    }
}

Source: https://twitter.com/AMAZlNGNATURE/status/1036940640427245569