Detecting abnormal sequences of Session Initiation Protocol (SIP) messages in real time is crucial to avoid SIP signaling-based attacks. In this paper, we propose a deep learning approach to detect the signaling patterns of multimedia sessions established with SIP. The approach is based on a recurrent neural network (RNN). We study the performance of different Long Short-Term Memory (LSTM) RNN architectures, which are trained on a dataset of trustworthy SIP dialogs captured by a SIP server. The trained RNNs are then used to detect SIP dialogs in real time. After characterizing the dataset adopted for training, validation, and testing, we present the experimental results obtained for the different RNN architectures, showing that the classification probability of trustworthy SIP dialogs exceeds 93% in the test stage. Finally, we present two methodologies to detect abnormal SIP dialogs, i.e., dialogs not contained in the trustworthy training dataset. After a detailed analysis of the skewness and kurtosis computed from the numerical RNN outputs, we show that they can be used as classification features. The first method is based on an unsupervised K-means classifier, while the second is based on a semi-supervised threshold-based classifier. Experimental results show that the threshold-based classifier achieves a detection probability of 99.45%, demonstrating the utility of the proposed methodology for detecting abnormal SIP sequences in a short period of time.
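
To illustrate how skewness and kurtosis of the RNN outputs could serve as features for a threshold-based decision, the following minimal Python sketch computes both statistics for a single output vector and flags the dialog as abnormal when either falls below a threshold. The function names, the threshold values, the example vectors, and the assumption that a known dialog yields a peaked output (and hence higher skewness and kurtosis than an unknown one) are illustrative assumptions and are not taken from the reported experiments.

```python
def _central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Third standardized moment (population form); 0.0 for constant input."""
    m2 = _central_moment(values, 2)
    return _central_moment(values, 3) / m2 ** 1.5 if m2 > 0 else 0.0


def kurtosis(values):
    """Fourth standardized moment (non-excess form); 0.0 for constant input."""
    m2 = _central_moment(values, 2)
    return _central_moment(values, 4) / m2 ** 2 if m2 > 0 else 0.0


def is_abnormal(outputs, skew_min, kurt_min):
    """Flag a dialog as abnormal when the RNN output vector is not peaked enough.

    Assumption (hypothetical): outputs for dialogs seen in training concentrate
    on one class, which raises both statistics; skew_min and kurt_min would be
    tuned on held-out data.
    """
    return skewness(outputs) < skew_min or kurtosis(outputs) < kurt_min


# Hypothetical RNN output vectors over four dialog classes.
peaked_output = [0.97, 0.01, 0.01, 0.01]   # resembles a known dialog
flat_output = [0.30, 0.25, 0.20, 0.25]     # no class clearly dominates

for name, out in (("peaked", peaked_output), ("flat", flat_output)):
    print(name, is_abnormal(out, skew_min=1.0, kurt_min=2.2))
```

In this sketch the two thresholds are fixed by hand; in practice they would be selected from the statistics of the trustworthy dialogs, which is what makes such a scheme semi-supervised.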