Jealousy, a complicated feeling, affects our emotional well-being and social contacts when we believe our relationships are in jeopardy (Ahlen et al., 2023; Kaufman-Parks et al., 2023). People of different ages, orientations, cultures, and relationship types feel this basic aspect of romantic relationships (Bernhard, 1986; DeSteno et al., 2006). Often related to the need for uniqueness, jealousy is closely related to accusations of infidelity (Echeburúa & Fernández-Montalvo, 2001). White (1981) defines jealousy as a mix of thoughts, emotions, and actions triggered by perceived threats to self-worth or relationship strength due to a partner’s interest in someone else. It appears intellectually, through doubts and anxieties; emotionally, through feelings such as anger and envy; and behaviourally, through acts performed to confirm these suspicions (Cuesta, 2006; Echeburúa et al., 2009; Guerrero et al., 2005; Pfeiffer & Wong, 1989). Jealousy is also significantly linked to intimate partner violence; rates in Latin America range from 20% to 50% (Villagrán et al., 2023).
Also, it has been observed that expressions of jealousy vary between genders, reflecting differences in the perception of infidelity and sexual and emotional competition (Kyegombe et al., 2022; Larsen et al., 2021; Toplu-Demirtaş et al., 2022). A meta-analysis by Pollet and Saxton (2020) showed that women are more sensitive to the attractiveness of rivals compared to men and tend to experience more emotional jealousy (Valentova et al., 2020; Zandbergen & Brown, 2015), while men exhibit more sexual jealousy (Edlund et al., 2019). Studies with non-human species also indicate that males and females differ in their reactions to jealousy, even at neurological or biological levels. For example, male titi monkeys show an increase in testosterone and cortisol, as well as lip-licking behaviours, when they see a female near another male, while females exhibit less visible reactions (Zablocki-Thomas et al., 2023).
The study of jealousy is crucial due to its negative impact on relationship dynamics and psychological well-being (Bringle & Buunk, 2021). Jealousy is linked to poor communication skills and alcohol use, which can mediate the relationship between jealousy and partner violence (Pichon et al., 2020). It is also associated with distrust, leading to obsessions and increasing the risk of depression, anxiety, and violence (Brandes et al., 2020). In fact, in Latin America it is known to be associated with machismo, mistrust and infidelity (Ariza et al., 2022). These adverse effects can escalate to serious situations, with 9% of partner homicides followed by suicide involving jealousy, more frequently in males who struggle to control their emotions and react violently to perceived threats to their masculinity (Johnson, 2024). This underscores the need for a gender-focused approach to addressing jealousy (Aloyce et al., 2024; Colasanti et al., 2023).
Understanding jealousy presents considerable challenges, partly due to the limited availability of specialised tools for its assessment. For example, one of the few options is the Multidimensional Jealousy Scale (MJS), which assesses jealousy in three dimensions (cognitive, emotional, and behavioural) through 15 items, but it has been validated only in Italian (Diotaiuti et al., 2022). Similar studies examining metric goodness and measurement invariance, such as the one by Diotaiuti et al. (2021), demonstrate the importance of validating psychometric tools in specific cultural contexts. Additionally, there is the Scale of Pathological Jealousy (CECLA), composed of 19 items that examine three types of jealousy: passionate, obsessive, and delusional (Prieto & Montesinos, 2021). However, the MJS is a generic scale that classifies jealousy into cognitive, emotional, and behavioural aspects, whereas the CECLA measures jealousy from a pathological perspective, that is, as a jealousy disorder.
In this context, the Brief Jealousy Scale (BJS; Ventura-León et al., 2018) emerges as an instrument that measures jealousy not from a pathological perspective but as a natural human response. The BJS is a subscale of the Inventory of Emotional Communication in Romantic Relationships (Sánchez, 2012). Its items were developed from manifestations of jealousy previously identified in the literature (Cuesta, 2006; Echeburúa et al., 2009; Guerrero et al., 2005; Pfeiffer & Wong, 1989). In preliminary validation, the instrument demonstrated adequate psychometric properties with a unidimensional model (CFI > .95; RMSEA < .08) and solid reliability, as reflected by the omega coefficient (ω = .88).
Given the relevance of jealousy in the dynamics of relationships and its potential negative impact on mental health and gender equality, it is crucial to have valid and reliable tools to measure it and understand its complexity. Such tools could help reduce violent behaviours, contributing to gender equality and promoting greater psychological well-being at both the individual and couple levels (Buller et al., 2023; Kyegombe et al., 2022). The availability of an instrument with an adequate level of validity and reliability would provide a solid foundation for deeper studies on romantic jealousy and its interaction with other factors. This would enable evidence-based interventions with a gender focus, reducing violence and promoting healthy behaviours in couples.
The jealousy scale (BJS) has been validated in Peru using Confirmatory Factor Analysis (CFA). This study introduces Item Response Theory (IRT), a superior methodology for evaluating all levels of jealousy intensity, providing accurate assessment through scale information functions, which measure the trait’s severity effectively (Bean & Bowen, 2021; DeVellis, 2006; Whittaker & Worthington, 2016). IRT ensures the internal invariance of item parameters such as difficulty and discrimination, allowing valid comparisons within the population assessed (Asún et al., 2017). Additionally, IRT provides conditional reliability, assessing measurement consistency at various points of the latent trait (Bean & Bowen, 2021; Whittaker & Worthington, 2016). Despite its advantages, IRT has not been applied to the BJS in Peru or among young and adult individuals in ongoing romantic relationships. This lack of psychometric validation may limit the scale’s effectiveness in clinical interventions. Therefore, validating the BJS with IRT in diverse contexts is crucial to ensure its relevance and accuracy amid sociocultural changes (Fairchild et al., 2005; Gjersing et al., 2010).
Considering the above, this study examines the psychometric properties (validity and reliability) of the BJS, which provides a basis for initiating studies that explore the relationship between jealousy and other variables of relationship dynamics. Despite the importance of jealousy, there are few instruments that measure this concept at a Latin American level. Thus, there is a need to understand the development of romantic relationships more fully (Neemann et al., 1995).
Method
Participants
Initially, 315 observations were collected. Unusual response patterns were then identified using the Zh index, a standardised person-fit statistic that helps detect “outlier” response patterns in survey data even when total scores appear typical (Drasgow et al., 1985). Taking Zh values beyond ±2.0 as indicating significant deviations from the expected pattern, 18 participants were excluded from the sample to ensure the integrity and accuracy of the data analysis (Felt et al., 2017).
This resulted in a final set of 297 records. The mean age was 26.52 years (SD = 7.75), indicating a varied representation of early adulthood. Women made up 74.10% of the sample and men 25.90%. In terms of sexual orientation, 89.90% identified as heterosexual, 8.42% as bisexual, and 1.68% as homosexual. Regarding the type of affective bond, “In Love” was the most frequent category at 59.30%, followed by “Dating” at 12.10%, “Married” at 11.10%, “Living Together” at 9.43%, and “Dating Around” at 8.08%; the average time in the relationship was around 42 months. Most participants were from Lima (78.50%), compared to 21.50% from outside Lima, indicating a strong urban presence in the study. All participants in the sample had a middle socioeconomic status and a university education level.
Instruments
Sociodemographic Data Sheet. In this research, a specially designed sheet was used to collect detailed data on the personal characteristics of the participants. This included information such as age, gender, sexual orientation, type of romantic relationship, and duration of said relationship.
The Brief Jealousy Scale (BJS; Ventura-León et al., 2018) corresponds to one dimension of the Inventory of Emotional Communication in Romantic Relationships (Sánchez, 2012). The BJS includes nine items rated on a Likert scale ranging from 1 (Not jealous at all) to 5 (Very jealous), covering various situations that can cause jealousy in an individual. For example, items present situations such as “If my partner spends much more time with someone else, I would feel jealousy” or “If I feel that my partner trusts someone else more than me, I would feel jealousy” (see Appendix). The validity of the scale was supported through confirmatory factor analysis, showing acceptable fit (CFI = .97; SRMR = .03; RMSEA = .08). Additionally, reliability was determined using the omega coefficient (ω = .88), indicating good internal consistency (Mafla et al., 2019).
Procedures
The project received approval from the Ethics Committee of the Universidad Privada del Norte, identified by code 0010-2024/ID-CIEI, and adhered to the guidelines of the Declaration of Helsinki (World Medical Association, 1964). Participants provided informed consent before inclusion. Questionnaires were administered virtually, following online research norms by Hoerger and Currell (2012), and shared via WhatsApp and Facebook. Participants took an average of 18 minutes to complete the questionnaires. Data collection occurred from September to December 2023, with results archived in the OSF repository: https://osf.io/j9bvu/
Data analysis
Statistical analyses were conducted using the R programming language within the RStudio environment. Various libraries were employed, such as ‘mirt’ (Chalmers, 2012), ‘ggplot2’ (Wickham et al., 2020), ‘tidyverse’ (Wickham, 2019), ‘semPlot’ (Epskamp, 2015), ‘jrt’ (Myszkowski, 2021), and ‘IRTools’ (Ventura-León, 2024) for data organisation.
Descriptive statistics were first calculated using response rates due to the ordinal nature of the variables. Item Response Theory (IRT) was then applied, complementing Classical Test Theory (CTT). IRT determines item parameters and provides an information function to evaluate test accuracy at different trait levels, rather than overall reliability (Zickar & Broadfoot, 2009).
A Graded Response Model (GRM; Samejima, 1997) was selected because it performed better than the Partial Credit Model (PCM) and the Generalised Partial Credit Model (GPCM), as indicated by a lower Bayesian Information Criterion (BIC; Schwarz, 1978), a criterion reported to be more accurate for polytomous IRT models (Kang et al., 2009). Assumptions were reviewed before applying IRT: (a) local independence, using the Q3* statistic (values below 0.20); and (b) monotonicity, through the category characteristic curves (Christensen et al., 2017). The Zh index (threshold ±2.0) identified aberrant response patterns (Felt et al., 2017). Reviewing outliers is crucial as they can affect model estimation (Yuan & Zhong, 2013).
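The person-fit screening described above can be illustrated with a minimal sketch. It is written in Python rather than with the R packages used in the study, and it assumes the standardised log-likelihood form of the statistic (Drasgow et al., 1985) with model-implied category probabilities taken as given. The function name `zh_statistic` and the toy probabilities are hypothetical and are not taken from the study data.

```python
import math

def zh_statistic(probs, responses):
    """Standardised log-likelihood person-fit statistic for one respondent.

    probs[i][k]  : model-implied probability of category k on item i,
                   evaluated at the respondent's estimated trait level.
    responses[i] : observed category index on item i.
    """
    observed = 0.0   # log-likelihood of the observed response pattern
    expected = 0.0   # expectation of that log-likelihood under the model
    variance = 0.0   # variance of that log-likelihood under the model
    for p_item, x in zip(probs, responses):
        logs = [math.log(p) for p in p_item]
        mean_i = sum(p * l for p, l in zip(p_item, logs))
        observed += logs[x]
        expected += mean_i
        variance += sum(p * l * l for p, l in zip(p_item, logs)) - mean_i ** 2
    return (observed - expected) / math.sqrt(variance)

# Hypothetical probabilities for three 3-category items (illustration only).
toy_probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
typical = zh_statistic(toy_probs, [0, 0, 2])    # most likely pattern
aberrant = zh_statistic(toy_probs, [2, 2, 0])   # least likely pattern
flagged = abs(aberrant) > 2.0                   # cutoff described in the text
```

In this toy case the least likely pattern yields a strongly negative value and would be flagged under the ±2.0 rule, whereas the most likely pattern would not.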
Item Response Theory (IRT) was applied using a two-parameter model (2PL). The discrimination parameter (a) measures an item’s ability to differentiate between individuals with high and low levels of the latent trait (θ), with values greater than 1 indicating high discrimination. The location parameters (b) indicate the points on the θ scale at which a respondent is equally likely to respond below or at and above a given category. The MCEM (Monte Carlo Expectation-Maximisation) algorithm was used for estimation.
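To make the roles of the discrimination and location parameters concrete, the following minimal sketch computes category probabilities for a single item under the graded response model. It is written in Python for illustration, not in the R environment used in the study. The function name and the parameter values are hypothetical and do not correspond to estimates reported here.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      : latent trait level.
    a          : discrimination parameter.
    thresholds : ordered location parameters b_1 < ... < b_{K-1}.
    """
    # Cumulative probabilities P(X >= k), with P(X >= 0) = 1 and P(X >= K) = 0.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical five-category item (values chosen for illustration only).
a_example = 2.0
b_example = [-1.5, -0.5, 0.5, 1.5]
probs_mid = grm_category_probs(0.0, a_example, b_example)
probs_at_b1 = grm_category_probs(-1.5, a_example, b_example)
```

At a trait level equal to the first location parameter, the probability of the lowest category is exactly one half, which is the sense in which each location parameter marks a boundary between adjacent response regions.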
Fit was assessed globally using log-likelihood, Comparative Fit Index (CFI ≥ .95), Tucker-Lewis Index (TLI ≥ .95), and Root Mean Square Error of Approximation (RMSEA ≤ .05; Maydeu-Olivares, 2013). Locally, items were assessed using RMSEA, with acceptable ranges between .05/(k-1) and .089 (Maydeu-Olivares & Joe, 2014). Generalised S-χ² was not used due to its sensitivity to sample size and requirement for random sampling (Hirschauer et al., 2020; Lin et al., 2013).
Reliability was calculated using the test information function and empirical reliability (rxx), which considers factor scores and model estimates (Du Toit, 2003). The `empirical_rxx()` function computes the variance of the ability (θ) estimates for a sample (N) and divides this variance by the sum of it and the square of the average standard error (Liu & Chalmers, 2018). This “true score / (true score + error)” approach estimates the consistency of observed scores in relation to true latent abilities (Seo & Jung, 2018).
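The ratio described above can be sketched in a few lines. This is an illustrative Python version, not the `empirical_rxx()` implementation itself. As an assumption, the error term is taken as the mean of the squared standard errors, which coincides with the square of the average standard error only when all standard errors are equal, as in the toy data below. The function name and the values are hypothetical.

```python
from statistics import mean, variance

def empirical_reliability(theta_hat, standard_errors):
    """Ratio of trait-estimate variance to (that variance + error variance)."""
    true_var = variance(theta_hat)                          # sample variance of the estimates
    error_var = mean([se ** 2 for se in standard_errors])   # assumed error term
    return true_var / (true_var + error_var)

# Hypothetical trait estimates and standard errors (illustration only).
theta_hat = [-1.0, -0.5, 0.0, 0.5, 1.0]
standard_errors = [0.4, 0.4, 0.4, 0.4, 0.4]
rxx = empirical_reliability(theta_hat, standard_errors)
```

Smaller standard errors relative to the spread of the trait estimates push the ratio towards 1, which is why the statistic is read as a reliability coefficient.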
Differential item functioning (DIF) by gender was examined using the Compensatory Differential Response Function (dDRF), a robust and standardised method recommended by Chalmers (2018). The dDRF quantifies differences in item functioning between groups in standard deviation units, considering the magnitude and direction of bias (Kleinman & Teresi, 2016). This allows for a clear interpretation, like Cohen’s d, with 0.2 indicating a small effect, 0.5 a moderate effect, and 0.8 or higher a large effect (Chalmers et al., 2016; Cohen, 1988). Because the gender groups were imbalanced, the SMOTE (Synthetic Minority Oversampling Technique) algorithm was used to oversample the minority group; it has shown good performance with imbalanced and categorical data (Islahulhaq & Ratih, 2021; Wongvorachan et al., 2023).
Results
Preliminary analysis
Figure 1 shows response rates and category characteristic curves for the jealousy scale in two panels, A and B. It provides a visual representation of how the items function across levels of the latent trait, helping to identify which items are most effective in measuring jealousy and at what levels. Panel A includes nine bar charts (C1-C9) showing response rates for categories 0 to 4. C7 has the most even distribution (~20% per category), while C9 has the largest disparity, peaking at 28% in category 0. Panel B displays the probability curves for each category against the latent trait (θ). As θ increases, the probability of responding in higher categories rises, with notable variations across items (C1-C9). These patterns highlight the relationship between the latent trait and response choices in jealousy-provoking situations.
Goodness of fit and reliability
Table 1 summarises the fit statistics for the 2PL graded response models. It illustrates how excluding problematic items leads to a better-fitting model and clarifies how the remaining items contribute to the measurement of the latent trait. Model M1 shows poor fit, with an RMSEA of .133. Removing problematic items in M2 improves RMSEA to .077, but the model still falls short because Q3* remains above 0.20. In M3, excluding item 5 results in an RMSEA of .079, which is below the .089 criterion and reflects adequate fit. CFI and TLI are near 1 in both M2 and M3, indicating good fit. BIC decreases across models, favouring M3. M3’s Q3* of .17 is below 0.20, indicating satisfactory local independence and supporting one-dimensionality. Empirical reliability (rxx) is good in all models.
Model parameters and effect size of estimates
Table 2 presents item statistics for the jealousy graded response model. It shows how each item in the scale contributes to measuring jealousy, reporting the discrimination parameters and potential gender biases, which are crucial for understanding the scale’s precision and fairness in diverse populations. Discrimination parameters (a) range from 1.96 to 2.86, indicating a moderate to strong relationship between the latent trait and item responses. The item-level RMSEA is .000 for all items except C8 (.020), indicating excellent fit. All p-values are above 0.05, suggesting adequate item fit. dDIF values for items C3 to C9 range from 0.19 for C4 to -0.18 for C8, showing significant differences in item functioning between men and women. Items C3 and C4 show slight bias towards women, with dDIF values of 0.24 and 0.19, respectively, while items C6, C8, and C9 show biases towards men. These differences, though statistically significant, are relatively small and may not be practically significant.
Note. dDIF: Compensatory Differential Response Function (sex variable); a: discrimination parameter; b: location parameter.
Figure 2 shows the information function of the individual items (C1 to C5) and of the entire jealousy scale. It illustrates how the scale and its items function at different levels of the latent trait, indicating where the measure is more and less precise. Each item plot indicates where the item is most informative relative to the latent trait (θ), highlighting the levels at which measurement precision is highest and error is lowest. The combined test plot reveals that the scale is most accurate around the centre of the θ distribution, with maximal information and minimal standard error, and that accuracy decreases as θ moves away from the centre.
Discussion
Jealousy in romantic relationships has been extensively studied (Park et al., 2024; Pichon et al., 2020; Pollet & Saxton, 2020) due to its negative impact on emotional well-being (Ahlen et al., 2023; Kaufman-Parks et al., 2023) and its association with relationship satisfaction (Elphinston et al., 2013; Himawan, 2017; Ventura-León et al., 2023; Ventura-León & Lino-Cruz, 2023). The Brief Jealousy Scale (BJS) is essential for identifying and assessing jealousy in romantic relationships. Item Response Theory (IRT) was chosen for its precise assessment of item scores against the latent trait (Bean & Bowen, 2021; DeVellis, 2006; Richardson et al., 2007). A valid and reliable jealousy tool is necessary for future research in the Peruvian context. The adoption of rigorous psychometric approaches, as illustrated by Diotaiuti et al. (2021) in their study on measurement invariance, supports the importance of confirming the metric goodness of tools in diverse populations.
The analyses of the BJS consistently demonstrated a single-factor structure, aligning with the original study (Ventura-León et al., 2018). Jealousy, an emotion driven by a desire for exclusivity, can manifest in various ways (Echeburúa & Fernández-Montalvo, 2001). Some items showed high correlations in the residual matrix and were removed to improve the scale’s accuracy. Items 1 and 2 were eliminated because they focused on attention and exclusivity rather than jealousy. Items 5 (‘If I find my partner openly flirting...’) and 7 (‘If my partner receives calls...’) were removed because they focus on potential infidelity behaviours rather than on emotions of jealousy (Jeanfreau & Mong, 2019). This refinement enhances the scale’s ability to capture the general construct of jealousy and its cognitive, emotional, and behavioural aspects (Echeburúa et al., 2009; Guerrero et al., 2005), reinforcing the BJS’s validity in measuring jealousy in romantic relationships.
The reliability of the BJS was evaluated using empirical reliability (rxx), which considers factor scores and model estimates (Du Toit, 2003). In all three models, reliability was above .80. The test and item information functions from the IRT model showed the BJS provides the highest measurement accuracy at moderate jealousy levels, effectively differentiating between degrees of the trait (Zickar & Broadfoot, 2009). This approach enhances the scale’s robustness, supporting its use in assessing jealousy across diverse romantic relationships (Bernhard, 1986; DeSteno et al., 2006). These findings underscore the BJS’s validity and reliability for both research and practical interventions.
The DIF results show small but significant biases in items C3 to C9 between men and women. Items C3 and C4 are slightly biased towards women, consistent with research indicating women experience more emotional jealousy (Pollet & Saxton, 2020; Valentova et al., 2020). Items C6, C8, and C9 are biased towards men, aligning with men’s tendency for sexual jealousy (Edlund et al., 2019). This pattern reflects gender differences in perceptions of infidelity and competition (Kyegombe et al., 2022; Larsen et al., 2021; Toplu-Demirtaş et al., 2022). The findings suggest the BJS fairly assesses jealousy in both men and women.
This study provides key insights into the BJS and its relevance in romantic relationships, particularly in the Peruvian context. It explores romantic jealousy, a prevalent construct with adverse effects on emotional well-being and social relationships (Ahlen et al., 2023; Kaufman-Parks et al., 2023). The research enhances understanding of how jealousy, assessed through the BJS, influences relationship dynamics and individual well-being. Jealousy is fundamental in romantic relationships, felt across ages, orientations, classes, cultures, and relationship types (Bernhard, 1986; DeSteno et al., 2006). The BJS effectively measures jealousy from a non-pathological perspective, aiding in the understanding of relationship dynamics in Peru. Analysing the BJS’s psychometric properties provides tools for identifying and evaluating jealousy in Peruvian romantic relationships, facilitating interventions and strategies for healthy behaviour. Understanding jealousy can help reduce violence, promote gender equality, and enhance psychological well-being (Buller et al., 2023; Kyegombe et al., 2022).
During the evaluation of the BJS in Peruvian couples, several methodological limitations were identified that may influence the study’s results and conclusions. Firstly, the use of non-probabilistic sampling limits the generalisability of our findings. This limitation suggests that our results might not fully represent the broader population, underscoring the need for random sampling in future research to enhance external validity. Secondly, the reliance on virtual data collection complicates the verification of inclusion criteria and may introduce self-selection bias (Hoerger & Currell, 2012), potentially skewing the data towards individuals more comfortable with online environments. To mitigate this, future studies could benefit from in-person data collection, which may reduce bias and ensure a more representative sample. Lastly, the relatively small sample size may have weakened the robustness of the psychometric analyses, particularly in the context of Item Response Theory (IRT). This limitation suggests that our conclusions regarding the BJS’s psychometric properties should be interpreted with caution. Future research with larger samples is necessary to provide more definitive evidence and strengthen the reliability of IRT analyses (Bean & Bowen, 2021; DeVellis, 2006; Whittaker & Worthington, 2016).
Jealousy in romantic relationships significantly impacts emotional well-being and satisfaction. The BJS is a valid and reliable tool for measuring jealousy in Peru. Its psychometric analysis using Item Response Theory confirms that the BJS effectively captures moderate jealousy levels, with item elimination sharpening its focus. DIF analysis reveals gender differences in item functioning, reflecting varied perceptions of infidelity and competition. The BJS’s strong validity and reliability make it a valuable tool for couple therapy and educational programmes to reduce violence and promote psychological health in relationships. This study enhances the understanding of romantic jealousy and its role in relationship dynamics, supporting interventions for healthier relationships and gender equality. Future research should employ random sampling, in-person data collection, and larger samples to improve generalisability.