IGNOU BECS-184 Solved Question Paper PDF Download

This page collects solved previous-year question papers for IGNOU BECS-184 in one place. The papers cover the important questions with detailed answers and are available together in a single PDF, which makes exam preparation easier.

The solved papers are offered in both English and Hindi. They help you understand the exam pattern, improve your answer writing, and build confidence for upcoming exams.

IGNOU BECS-184 Previous Year Solved Question Paper in Hindi

Q1. Explain the various measures of central tendency. What is the relationship between the mean, median and mode? State the situations in which the median and the mode are appropriate.

Ans.

Measures of central tendency are statistical values that represent the centre, or typical value, of a data set. They indicate the central point of the data's distribution. The main measures are as follows:

1. Mean: Also called the ‘average’. It is obtained by dividing the sum of all values in the data set by the total number of values.

Formula: Mean (μ or x̄) = (sum of all values) / (total number of values) = Σx / n

It is the most widely used measure, but it is strongly affected by extreme values (outliers).

2. Median: The median is the value that divides a data set into exactly two equal parts when the data are arranged in ascending or descending order.

If the number of values (n) is odd, the median is the ((n+1)/2)th value.
If the number of values (n) is even, the median is the average of the two middle values.

The median is not affected by extreme values, so it is a better measure for skewed data.

3. Mode: The mode is the value that occurs most frequently in the data set. A data set may have more than one mode (bimodal, trimodal) or no mode at all. It is used mainly for qualitative (categorical) data.

Relationship between the mean, median and mode:

For a moderately skewed distribution, there is an empirical relationship among the three, expressed as:

Mode ≈ 3(Median) - 2(Mean)

or

Mean - Mode ≈ 3(Mean - Median)

In a symmetrical distribution, Mean = Median = Mode.
In a positively skewed distribution, Mean > Median > Mode.
In a negatively skewed distribution, Mean < Median < Mode.

Situations in which the median and the mode are appropriate:

The median is appropriate when:

The data set contains extreme values (outliers), as in income or wealth data. The median is not affected by these extreme values and represents the centre better.
The distribution of the data is skewed.
The data contain open-ended classes, such as “more than 100”.

The mode is appropriate when:

The data are qualitative or categorical, such as the most popular car colour, the favourite brand, or the candidate with the most votes in an election survey.
In business, it is important to know which product size sells the most (for example, shoe or shirt sizes).
The most common or typical value of the distribution is to be found.
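
The three measures and the empirical relation can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the income figures and the helper name `empirical_mode` are made up for illustration only.

```python
from statistics import mean, median, multimode

# Hypothetical monthly incomes (in thousands); the last value is an outlier.
incomes = [20, 22, 22, 25, 27, 30, 250]

print(mean(incomes))       # pulled upward by the single outlier
print(median(incomes))     # 25, the middle value of the sorted data
print(multimode(incomes))  # [22], the most frequent value(s)


def empirical_mode(mean_value, median_value):
    """Approximate the mode from the empirical relation
    mode ≈ 3 * median - 2 * mean (moderately skewed data only)."""
    return 3 * median_value - 2 * mean_value


# Hypothetical moderately skewed distribution: mean 50, median 48.
print(empirical_mode(50, 48))  # 44
```

With the outlier present, the mean lands far above most of the incomes while the median stays at 25, which is why the median is preferred for skewed data such as incomes.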

Q2. (a) Briefly describe the methods of collecting primary data. (b) Explain the steps involved in planning and organising a sample survey.

Ans.

(a) Methods of collecting primary data:

Primary data are data collected by the researcher for the first time, specifically for the purpose of his or her own research. They are original and fresh. The main methods of collection are as follows:

1. Direct Personal Interview: The investigator contacts the respondents directly and asks questions face to face. This method yields reliable and detailed information, but it is expensive and time-consuming.

2. Indirect Oral Investigation: This method is used when it is not possible to contact the respondents directly. Information is collected from persons or witnesses who have knowledge of the subject.

3. Questionnaire Method: A list of questions (a questionnaire) is prepared and given to respondents to fill in. This can be done in two ways:

Mailed Questionnaire: The questionnaire is sent by post. It is a cheap way to cover a large area, but the response rate is often low.
Questionnaire filled by Enumerators: Trained enumerators visit the respondents and fill in the questionnaire themselves. This yields high-quality data but is expensive.

4. Observation Method: The researcher directly watches and records the behaviour of persons or events without any communication. It is useful for behavioural studies, but it can be subjective.

5. Telephonic Interview: The researcher asks the respondents questions over the telephone. It is quick and inexpensive, but it is limited to people who have a telephone, and non-verbal cues are missing.

(b) Steps in planning and organising a sample survey:

A sample survey is a systematic process in which conclusions about an entire population are drawn by studying a small part of it (the sample). Its planning and organisation involve the following steps:

1. Defining the objectives: The first and most important step is to define the objectives of the survey clearly. It is necessary to know what information is to be collected and why.

2. Defining the population: Clearly define the group to be studied. For example, if you are studying the spending habits of students, the population could be all students of a particular university.

3. Determining the sampling frame: This is a list of all units of the population from which the sample will be selected, such as an electoral roll or a telephone directory. A good frame should be complete and up to date.

4. Choosing the sampling method: Decide how the sample will be selected. This may be probability sampling (such as simple random or stratified sampling) or non-probability sampling (such as convenience or quota sampling). The choice of method depends on accuracy, cost and time.

5. Determining the sample size: Decide how many units will be included in the survey. The sample should be large enough for the results to be reliable, but not so large that cost and time become excessive.

6. Designing and pre-testing the questionnaire: Prepare an effective questionnaire based on the objectives of the survey. The questions should be clear, unbiased and easy to understand. Before the main survey, the questionnaire should be pre-tested on a small group so that any problems can be detected.

7. Training and organising the field workers: If enumerators are used, they must be properly trained so that they collect data in a standardised way.

8. Collecting the data: Collect data from the sample using the chosen method.

9. Processing and analysing the data: Check, code and tabulate the collected data, and then analyse them using statistical techniques.

10. Writing the report: Finally, present the findings in a detailed report that covers the methodology, the main findings and the recommendations.
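
Steps 3 to 5 above (frame, sampling method, sample size) can be sketched in code. The example below is only an illustration under stated assumptions: the frame of 100 students, the `UG`/`PG` strata and the 10% sampling fraction are all hypothetical, and the function names are not taken from any survey package.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Draw the same fraction from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(units, k))
    return sample


# Hypothetical frame: 100 students, the first 60 undergraduate, the rest postgraduate.
frame = [(i, "UG" if i < 60 else "PG") for i in range(100)]

srs = simple_random_sample(frame, 10, seed=1)
strat = stratified_sample(frame, lambda unit: unit[1], 0.10, seed=1)

print(len(srs), len(strat))
```

A simple random sample of 10 may by chance contain very few postgraduates, whereas the stratified sample always contains 6 undergraduates and 4 postgraduates, mirroring the frame.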

Q3. Distinguish between the null hypothesis and the alternative hypothesis. State the steps you would take to test a hypothesis.

Ans.

Difference between the null hypothesis and the alternative hypothesis:

In hypothesis testing, two competing hypotheses are formulated: the null hypothesis and the alternative hypothesis.

Null Hypothesis (H₀):

It is a statement that claims no relationship between variables, no difference between groups, or the status quo.
It is the hypothesis that the researcher tries to reject on the basis of statistical evidence.
It is always expressed as an equality, such as μ = 100 or p₁ = p₂.
Example: “A new drug has no effect on blood pressure.”

Alternative Hypothesis (H₁ or Hₐ):

It is a statement that contradicts the null hypothesis. It is the claim the researcher wants to establish.
It suggests a relationship between variables, a difference between groups, or the presence of an effect.
It is expressed as an inequality (≠) or as a directional claim (< or >).
Example: “A new drug lowers blood pressure.” (μ < μ₀) or “A new drug has an effect on blood pressure.” (μ ≠ μ₀)

In short, H₀ is the ‘no effect’ position, while H₁ is the ‘effect’ the researcher is looking for. The purpose of hypothesis testing is to determine whether the sample data provide sufficient evidence to reject H₀.

Steps in testing a hypothesis:

The process of testing a hypothesis involves the following systematic steps:

1. State H₀ and H₁: First, define the null (H₀) and alternative (H₁) hypotheses clearly on the basis of the research question. The alternative hypothesis may be one-tailed or two-tailed.

2. Choose the significance level (α): The significance level (α) is the probability of rejecting the null hypothesis when it is actually true (Type I error). The usual values of α are 0.05 (5%), 0.01 (1%) or 0.10 (10%).

3. Select the appropriate test statistic: Choose the correct statistical test on the basis of the nature of the data, the sample size and the hypothesis, for example the z-test, t-test, chi-square (χ²) test or F-test.

4. Formulate the decision rule: Determine the ‘rejection region’ or ‘critical value’ from the theoretical distribution of the test statistic. If the computed test statistic falls in this region, H₀ will be rejected.

5. Compute the test statistic: Calculate the value of the chosen test statistic from the sample data.

6. Make a decision: Compare the computed statistic with the critical value (or compare the p-value with α).

If the computed value falls in the rejection region (or if p-value ≤ α), reject the null hypothesis.
If the computed value does not fall in the rejection region (or if p-value > α), fail to reject the null hypothesis.

7. Interpret the conclusion: Finally, interpret the statistical decision in the context of the research question and state the practical significance of the findings.
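
The seven steps can be traced in a small one-sample z-test using the critical-value approach. This is a sketch under stated assumptions: the population standard deviation is taken as known, the test statistic is treated as standard normal, and the numbers (μ₀ = 100, σ = 15, n = 36, sample mean 104) and the function name `z_test_critical` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_critical(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample z-test decided by comparing z with the critical value.

    Returns (z, critical_value, reject_h0).
    """
    # Step 5: compute the test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal distribution
    # Step 4: the rejection region depends on the form of H1.
    if tail == "two":        # H1: mu != mu0
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "right":    # H1: mu > mu0
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "left":     # H1: mu < mu0
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, critical, reject


# H0: mu = 100 against H1: mu != 100, alpha = 0.05 (all values hypothetical).
z, critical, reject = z_test_critical(104, 100, 15, 36)
print(round(z, 2), round(critical, 2), reject)  # 1.6 1.96 False
```

Here z = 1.6 lies inside ±1.96, so the decision in step 6 is to fail to reject H₀ at the 5% level.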

Q4. What is a composite index? Explain how a composite index is constructed.

Ans.

Composite Index:

A composite index is a statistical measure of the changes in a group of related variables over time, across geographical locations or across other characteristics. It combines several different but related indicators into a single numerical value so that a complex concept can be simplified and monitored. Its purpose is to present the overall trend or change in many variables at a glance.

For example:

Consumer Price Index (CPI): It combines the prices of various goods and services (such as food, clothing and housing) to measure changes in the cost of living.
Human Development Index (HDI): It combines indicators such as life expectancy, education and per capita income to measure a country's level of social and economic development.
Stock market indices (such as the Sensex and the Nifty): They combine the share prices of several major companies to show the overall performance of the market.

Steps in constructing a composite index:

Constructing a composite index is a complex process that involves several decisions. The main steps are as follows:

1. Defining the purpose and scope: First, it must be clearly defined what the index is intended to measure. For example, will it measure the price level, industrial production or social welfare? Its scope (geographical, sectoral) should also be fixed.

2. Selecting the items: Carefully select the items or indicators to be included in the index. They should be representative of the concept being measured. For the CPI, for example, the goods and services consumed by a typical household are selected.

3. Selecting the base period: A base period must be chosen against which comparisons will be made. It should be normal and recent, with no major abnormality (such as war or famine). The index value of the base period is always 100.

4. Collecting the data: Collect data for the selected items, such as prices, quantities or other values. The data should come from reliable sources.

5. Determining the weights: Each item in the index is given a weight according to its relative importance. In the CPI, for example, expenditure on food is given more weight than expenditure on clothing because households spend a larger share of their income on food. The weights may be based on quantity or on value.

6. Selecting an appropriate formula: Choose a suitable average or formula for computing the index. The common formulas for weighted indices are:

Laspeyres' Index: It uses base-period quantities as weights.
Paasche's Index: It uses current-period quantities as weights.
Fisher's Ideal Index: It is the geometric mean of the Laspeyres and Paasche indices and is regarded as the best of the three on statistical grounds.

7. Computing the index: Compute the final index value using the chosen formula and weights. This value shows the overall change relative to the base period.
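
The three weighted price-index formulas named in step 6 can be written out directly: Laspeyres = 100 · Σp₁q₀ / Σp₀q₀, Paasche = 100 · Σp₁q₁ / Σp₀q₁, and Fisher = √(Laspeyres × Paasche). The sketch below applies them to a hypothetical basket of three items; the prices and quantities are invented for illustration.

```python
from math import sqrt


def weighted_sum(prices, quantities):
    """Sum of price x quantity over all items in the basket."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres(p0, p1, q0):
    """Price index with base-period quantities q0 as weights."""
    return 100 * weighted_sum(p1, q0) / weighted_sum(p0, q0)


def paasche(p0, p1, q1):
    """Price index with current-period quantities q1 as weights."""
    return 100 * weighted_sum(p1, q1) / weighted_sum(p0, q1)


def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


# Hypothetical basket of three items.
p0, q0 = [10, 20, 5], [5, 2, 10]   # base-period prices and quantities
p1, q1 = [12, 25, 6], [4, 3, 10]   # current-period prices and quantities

print(round(laspeyres(p0, p1, q0), 2))   # 121.43
print(round(paasche(p0, p1, q1), 2))     # 122.0
print(round(fisher(p0, p1, q0, q1), 2))  # 121.71
```

Passing the base-period prices as both `p0` and `p1` returns exactly 100, which matches the convention in step 3 that the base period has an index value of 100.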

Q5. Explain the difference between a simple bar diagram and a histogram. Draw a histogram from the following data:

Scores in numerical ability / Probability
0-20 / 5
20-30 / 30
30-40 / 50
40-50 / 35
50-60 / 20

Ans.

Difference between a Simple Bar Diagram and a Histogram:

Both the simple bar diagram and the histogram present data graphically, but they are used for different kinds of data and differ in important ways in their construction.

Feature: Type of data
Simple bar diagram: Used for discrete or qualitative (categorical) data, for example the population of different countries or production in different years.
Histogram: Used for continuous data grouped into class intervals.

Feature: Spacing between bars
Simple bar diagram: There are equal gaps between the bars. Each bar represents a separate category.
Histogram: The bars are adjacent to one another with no gaps, because they represent a continuous series.

Feature: Width of bars
Simple bar diagram: The width of the bars has no special significance, but all bars should be of equal width.
Histogram: The width of a bar shows the width of the class interval. The widths may be unequal.

Feature: X-axis
Simple bar diagram: The X-axis shows separate categories or discrete values.
Histogram: The X-axis is a continuous number line showing the class intervals.

Feature: Area versus height
Simple bar diagram: The height (or length) of a bar is proportional to the frequency.
Histogram: The area of a bar is proportional to the frequency. If the class intervals are equal, the height is proportional to the frequency.

Constructing the histogram:

The given data are continuous and grouped into class intervals, so a histogram can be drawn.

(Note: The question uses the word “probability”, which is probably a typing error. The intended meaning is “frequency”.)

Data:

Scores in numerical ability (class interval): 0-20, 20-30, 30-40, 40-50, 50-60
Frequency: 5, 30, 50, 35, 20

Steps for drawing the histogram:

1. Draw two axes on graph paper: the horizontal axis (X-axis) and the vertical axis (Y-axis).

2. On the X-axis, mark ‘Scores in numerical ability’ on a suitable scale. Here, mark 0, 10, 20, 30, 40, 50 and 60 at equal distances. This axis represents the class intervals.

3. On the Y-axis, mark ‘Frequency’ (per class width of 10) on a suitable scale. Since the largest value is 50, mark the values from 0 to 50 in steps of 5 or 10 (for example 0, 10, 20, 30, 40, 50).

4. Now draw a rectangular bar for each class interval. There are no gaps between the bars because the data are continuous. The first class (0-20) is twice as wide as the others, so its height must be adjusted to keep the area of every bar proportional to its frequency: height = frequency × 10 / class width.

0-20: Draw a bar from 0 to 20 on the X-axis with a height of 5 × 10/20 = 2.5.
20-30: Draw a bar from 20 to 30 on the X-axis with a height of 30. It is adjacent to the first bar.
30-40: Draw a bar from 30 to 40 on the X-axis with a height of 50.
40-50: Draw a bar from 40 to 50 on the X-axis with a height of 35.
50-60: Draw a bar from 50 to 60 on the X-axis with a height of 20.

The resulting graph is the histogram of the given frequency distribution. It may be titled “Histogram of numerical ability scores”.
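
The class intervals in this question are unequal (0-20 has width 20, the rest have width 10), so the bar heights should be frequency densities if the area of each bar is to stay proportional to its frequency. The sketch below computes those heights and prints a rough text histogram; the function name `bar_heights` and the choice of 10 as the reference width are assumptions made for this illustration.

```python
classes = [(0, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [5, 30, 50, 35, 20]


def bar_heights(classes, frequencies, unit_width=10):
    """Height of each bar = frequency per `unit_width` units of score,
    so that height x (class width / unit_width) equals the frequency."""
    heights = []
    for (lower, upper), freq in zip(classes, frequencies):
        width = upper - lower
        heights.append(freq * unit_width / width)
    return heights


heights = bar_heights(classes, frequencies)

# Rough text rendering: one '#' per 2.5 units of height.
for (lower, upper), height in zip(classes, heights):
    print(f"{lower:>2}-{upper:<2} | {'#' * round(height / 2.5)} {height}")
```

Only the 0-20 class changes: its frequency of 5 is spread over a class twice as wide, giving a height of 2.5, while the four classes of width 10 keep heights equal to their frequencies.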

Q6. Explain how the p-value method is used in hypothesis testing.

Ans.

The p-value is an important concept in hypothesis testing. It is the probability of obtaining the observed sample results, or results more extreme than them, if the null hypothesis (H₀) is actually true. In simple terms, it measures the incompatibility between your data and the null hypothesis.

Use of the p-value method in hypothesis testing:

The p-value method is an alternative to the traditional critical-value method and is widely used in modern statistical software. The steps are as follows:

1. State the hypotheses: First, define the null hypothesis (H₀) and the alternative hypothesis (H₁) clearly. H₀ represents the ‘no effect’ position, while H₁ represents the effect the researcher wants to establish.

2. Choose the significance level (α): Choose a significance level (alpha), which is the maximum acceptable probability of making a Type I error (rejecting a true H₀). Common values are 0.05, 0.01 and 0.10. It serves as the threshold for the decision.

3. Compute the test statistic: Calculate the appropriate test statistic (such as the t-statistic, z-statistic or chi-square statistic) from the sample data. This value measures how far the sample result is from the result expected under H₀.

4. Compute the p-value: Find the p-value from the computed test statistic. The p-value is the area (or probability) in the distribution of the test statistic that lies beyond the computed value.

One-tailed test: the p-value is the area in one tail of the distribution.
Two-tailed test: the p-value is the total area in both tails of the distribution.

5. Apply the decision rule: Compare the p-value with the significance level (α).

If p-value ≤ α: the observed results would be so rare (if H₀ were true) that we reject the null hypothesis. The result is called “statistically significant”. It provides evidence in favour of the alternative hypothesis.
If p-value > α: the observed results could plausibly have occurred by chance (if H₀ is true). We therefore fail to reject the null hypothesis. The result is called “not statistically significant”.

Example:

Suppose a researcher wants to test whether a new teaching method improves students' scores. H₀ is “no improvement” and H₁ is “there is improvement”. The researcher chooses α = 0.05 and, after the analysis, obtains a p-value of 0.023.

Since the p-value (0.023) ≤ α (0.05), the researcher rejects the null hypothesis and concludes that there is sufficient statistical evidence that the new teaching method is effective.

One advantage of the p-value method is that it does not only say whether a result is significant; it also indicates the strength of the evidence. A very small p-value (such as 0.001) indicates very strong evidence against H₀.
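
Steps 4 and 5 can be sketched for a z-statistic. The code assumes the test statistic follows a standard normal distribution under H₀; the function names `p_value_from_z` and `decide` are illustrative, and the z-value of 2.0 is a made-up input chosen because it gives a right-tail p-value close to the 0.023 in the example.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two"):
    """Tail area beyond the observed z under the standard normal curve."""
    std_normal = NormalDist()
    if tail == "right":   # H1: parameter is larger than the H0 value
        return 1 - std_normal.cdf(z)
    if tail == "left":    # H1: parameter is smaller than the H0 value
        return std_normal.cdf(z)
    if tail == "two":     # H1: parameter differs; add both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'two', 'right' or 'left'")


def decide(p_value, alpha=0.05):
    """Decision rule: reject H0 when the p-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


p_right = p_value_from_z(2.0, tail="right")
p_two = p_value_from_z(2.0, tail="two")
print(round(p_right, 4), decide(p_right))
print(round(p_two, 4), decide(p_two))
print(decide(p_two, alpha=0.01))
```

The same statistic yields a two-tailed p-value that is twice the one-tailed value, and a result that is significant at α = 0.05 need not be significant at α = 0.01, which is why α is fixed before the data are analysed.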

Q7. Distinguish between correlation and causation. Explain how the relationship between variables [X = cause and Y = effect] is analysed.

Ans.

Difference between Correlation and Causation:

Correlation:

Correlation refers to a statistical relationship between two variables. It measures the extent to which two variables change together.
It indicates the direction (positive or negative) and the strength (strong or weak) of the relationship.
Positive correlation: when one variable increases, the other also increases (for example, exercise and calories burned).
Negative correlation: when one variable increases, the other decreases (for example, temperature and sales of warm clothing).
The correlation coefficient (r) lies between -1 and +1, where +1 denotes perfect positive correlation, -1 perfect negative correlation, and 0 no linear correlation.
Important point: correlation does not imply causation. The fact that two variables change together does not prove that one causes the other.

Causation:

Causation, also called a cause-and-effect relationship, refers to a situation in which one event (the cause) brings about another event (the effect).
Here, a change in one variable directly produces a change in the other.
Example: Smoking (cause) leads to lung cancer (effect). Rain (cause) makes the ground wet (effect).
Establishing causation is much harder than establishing correlation.

Example: In summer there is a strong positive correlation between ice-cream sales and the number of drowning incidents. This does not mean that eating ice cream causes drowning. A third variable, hot weather (a confounding variable), drives both. In hot weather people eat more ice cream and also go swimming more often, which increases the number of drownings.

Analysing the relationship between X (cause) and Y (effect):

Correlation alone is not enough to establish that variable X causes variable Y. To analyse and establish a cause-and-effect relationship, the following criteria should be met:

1. Establish correlation: The first step is to show that a statistically significant correlation exists between X and Y. If the variables are not related at all, there can be no cause and effect. This can be analysed with the correlation coefficient or with regression analysis.

2. Temporal precedence: The cause must come before the effect. That is, the change in X must occur before the change in Y. Longitudinal studies, in which the variables are measured over time, can help establish this criterion.

3. Rule out alternative explanations: This is the most challenging step. We must make sure that the observed relationship between X and Y is not due to a third, confounding variable. There are two main approaches:

Controlled experiments: These are the ‘gold standard’ for establishing causation. In an experiment, the researcher deliberately manipulates X (the independent variable), holds all other possible variables constant, and then measures the effect on Y (the dependent variable). Random assignment helps ensure that the treatment and control groups are similar at the start.
Advanced statistical control: When experiments are not possible (as in economics or sociology), statistical techniques such as multiple regression analysis are used. Here the researcher mathematically ‘controls’ for the effect of other possible confounding variables in order to estimate the net relationship between X and Y.

4. A logical mechanism: There should be a credible theory or mechanism that explains how X causes Y. Such a theory lends credibility to the analysis.
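
The ice-cream and drowning example can be simulated to show how a confounder creates a correlation that disappears once it is controlled for. Note that this sketch uses a first-order partial correlation as a simple stand-in for the multiple regression mentioned above; the simulated data, the noise levels and the function names are all assumptions made for illustration.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Simulated (not real) data: temperature drives both variables,
# and neither variable has any direct effect on the other.
rng = random.Random(0)
temperature = [rng.gauss(0, 1) for _ in range(2000)]
ice_cream = [t + rng.gauss(0, 0.5) for t in temperature]
drownings = [t + rng.gauss(0, 0.5) for t in temperature]

print(round(pearson_r(ice_cream, drownings), 2))               # strongly positive
print(round(partial_r(ice_cream, drownings, temperature), 2))  # close to zero
```

The raw correlation is strong even though the simulation contains no causal link between the two variables; holding temperature fixed removes almost all of it, which is the pattern expected when a confounding variable explains the association.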

Q8. Give a brief account of the sources of secondary data and their limitations.

Ans.

Secondary Data:

Secondary data are data that have already been collected and processed by some other person or organisation for some other purpose. The researcher uses them to answer his or her own research question. They contrast with primary data, which the researcher collects personally. Secondary data can be readily available, cheap and time-saving.

Sources of secondary data:

There are many sources of secondary data, which can be broadly classified into published and unpublished sources.

A. Published Sources:

Government publications: Central and state governments collect and publish many kinds of data.
- Census of India: detailed information on population, literacy, sex ratio and so on.
- National Sample Survey Organisation (NSSO): surveys on employment, unemployment, consumption expenditure, health and so on.
- Reserve Bank of India (RBI): bulletins and reports on banking, finance, inflation and other economic indicators.
- Central Statistical Office (CSO): data on national income, industrial production and so on.
International publications:
- World Bank: data on the development indicators of different countries.
- International Monetary Fund (IMF): international financial statistics.
- United Nations Organisation (UNO): global data on population and on social and economic topics.
Trade, industry and financial publications: Various industry associations (such as FICCI and CII) and stock exchanges publish data on their respective fields.
Reports of research institutions and universities: Academic institutions publish the findings of research on various subjects.
Magazines, newspapers and websites: These sources provide data on current events and on a variety of topics.

B. Unpublished Sources:

These include data that have been collected but not published. They can be found in the internal records of government departments, private firms, trade unions and other organisations. Access to them can be difficult.

Limitations of secondary data:

Unsuitability: Because the data were collected for another purpose, they may not match the needs of the present research. Definitions, units or coverage may differ.
Outdated data: The data may be old and may not reflect the current situation.
Lack of reliability: The reliability of the source that collected the data may be doubtful. It can be hard to know by what method, with how much bias, and with what level of accuracy the data were collected.
Insufficient information: The data may lack the detail or classification needed for the present research, for example when rural and urban income figures are needed but the available data show only total income.
Bias: The organisation that collected the data may have its own bias, which can affect the impartiality of the data.

Therefore, before using secondary data, the researcher should carefully evaluate their suitability, reliability, adequacy and accuracy.

Q9. Give a brief account of the techniques for analysing multivariate data.

Ans.

Multivariate data are data in which more than two variables are measured for each observation. Multivariate analysis techniques aim to understand the complex relationships among these variables simultaneously. Compared with univariate and bivariate analysis, they give a more realistic representation of reality.

The main techniques for analysing multivariate data are as follows:

1. Multiple Regression Analysis:

It is one of the most widely used multivariate techniques.
It is used to study the joint effect of two or more independent variables on one dependent variable and to predict the value of the dependent variable.
Example: explaining a person's income (dependent variable) on the basis of education, experience, age and gender (independent variables).

2. Factor Analysis:

It aims to reduce a large number of interrelated variables to a smaller number of underlying, unobservable factors.
It helps simplify the data by identifying patterns and structure in them.
Example: identifying an underlying factor called “customer satisfaction” from several survey questions (such as quality, price and service).

3. Cluster Analysis:

This technique is used to classify observations or objects into groups (clusters) on the basis of the similarity of their attributes or characteristics.
The goal is for objects within the same group to be more similar to one another and for objects in different groups to be dissimilar.
Example: dividing customers into segments on the basis of their buying habits and demographics for market segmentation.

4. Discriminant Analysis:

It is used to classify an observation into one of several pre-defined groups, or to predict its group, on the basis of a set of independent variables.
It is similar to regression, but here the dependent variable is qualitative (categorical).
Example: predicting from financial variables whether a loan applicant will be a ‘good credit risk’ or a ‘bad credit risk’.

5. Multivariate Analysis of Variance (MANOVA):

It is an extension of ANOVA. It is used to test whether there are statistically significant differences between two or more groups in the means of more than one dependent variable.
It analyses the effect of the groups on several dependent variables at the same time.
Example: examining whether three different teaching methods affect students' scores in both mathematics and science.

6. Canonical Correlation Analysis:

This technique analyses the relationship between two sets of variables. It seeks linear combinations of each set that are maximally correlated with each other.
Example: studying the relationship between a set of psychological variables (such as anxiety and motivation) and a set of academic performance variables (such as grades and attendance).
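
Of the techniques above, cluster analysis is easy to illustrate without any external library. The sketch below uses k-means, which is one common clustering method and is chosen here only as an example; the six customers (monthly visits, monthly spend), the starting centroids and the function name `k_means` are hypothetical.

```python
def k_means(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            # Squared Euclidean distance from the point to every centroid.
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, spend per month in thousands).
customers = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(customers, centroids=[(0, 0), (10, 10)])

print(clusters[0])  # the low-activity segment
print(clusters[1])  # the high-activity segment
```

Customers inside each resulting segment are close to one another and far from the other segment, which is the goal stated for cluster analysis.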

Q10. Explain the techniques that can be used to analyse qualitative data.

Ans.

गुणात्मक आँकड़े (Qualitative Data) गैर-संख्यात्मक जानकारी होते हैं, जो अक्सर पाठ, ऑडियो या वीडियो के रूप में होते हैं। ये आँकड़े लोगों के अनुभवों, विचारों, भावनाओं और व्यवहारों की गहरी समझ प्रदान करते हैं। गुणात्मक आँकड़ों का विश्लेषण संख्यात्मक आँकड़ों से भिन्न होता है, क्योंकि इसमें व्याख्या और पैटर्न की पहचान पर अधिक जोर दिया जाता है।

गुणात्मक आँकड़ों का विश्लेषण करने की प्रमुख तकनीकें निम्नलिखित हैं:

1. विषय-वस्तु विश्लेषण (Content Analysis):

यह गुणात्मक डेटा का विश्लेषण करने के लिए सबसे आम तकनीकों में से एक है।
इसमें पाठ, छवियों या मीडिया के भीतर कुछ शब्दों, अवधारणाओं, विषयों, वाक्यांशों या वाक्यों की उपस्थिति, अर्थ और संबंधों को व्यवस्थित रूप से पहचाना, कोडित और वर्गीकृत किया जाता है।
यह गुणात्मक डेटा को मात्रात्मक डेटा में बदल सकता है (जैसे, एक साक्षात्कार में किसी विशेष शब्द का कितनी बार उल्लेख किया गया)।
उदाहरण: ग्राहकों की समीक्षाओं का विश्लेषण करके यह पता लगाना कि वे किसी उत्पाद की किन विशेषताओं के बारे में सबसे अधिक बात करते हैं।

2. विषयगत विश्लेषण (Thematic Analysis):

इस तकनीक का फोकस डेटा के भीतर बार-बार आने वाले पैटर्न या ‘विषयों’ (themes) की पहचान, विश्लेषण और रिपोर्टिंग करना है।
यह विषय-वस्तु विश्लेषण से अधिक व्याख्यात्मक है। शोधकर्ता डेटा को ध्यान से पढ़ता है, प्रारंभिक कोड बनाता है, और फिर इन कोडों को व्यापक विषयों में समूहित करता है जो डेटा के अर्थ को दर्शाते हैं।
उदाहरण: कर्मचारियों के साक्षात्कारों का विश्लेषण करके कार्यस्थल पर तनाव के मुख्य विषयों (जैसे कार्यभार, प्रबंधन शैली, सहकर्मियों से संबंध) की पहचान करना।

3. आधारित सिद्धांत (Grounded Theory):

यह एक आगमनात्मक (inductive) दृष्टिकोण है जहाँ सिद्धांत को डेटा से ही विकसित किया जाता है, न कि पूर्व-मौजूदा सिद्धांत का परीक्षण किया जाता है।
शोधकर्ता डेटा संग्रह और विश्लेषण की एक सतत और पुनरावृत्तीय प्रक्रिया में संलग्न रहता है। कोडिंग, मेमो-लेखन और सैद्धांतिक प्रतिचयन के माध्यम से, डेटा से धीरे-धीरे एक सिद्धांत उभरता है।
यह एक कठोर और समय लेने वाली प्रक्रिया है जिसका उपयोग सामाजिक प्रक्रियाओं को समझने के लिए किया जाता है।

4. Narrative Analysis:

This technique focuses on stories and personal accounts. The researcher examines how people narrate their experiences and give them meaning.
It analyzes the structure, content, and function of the narrative, and is useful for understanding people’s lives and identities.
Example: Analyzing the life stories of migrants to understand their process of adaptation.

5. Discourse Analysis:

It is the study of language use in its social context. It looks not only at what is said but also at how it is said.
It analyzes how language constructs power relations, social identities, and knowledge.
Example: Analyzing political speeches to understand how politicians use language to promote their ideologies.

Coding is often a central process in these techniques: labels or tags are assigned to segments of data so that they can be organized, retrieved, and analyzed.
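The counting step of content analysis can be sketched in a few lines of Python. This is only an illustration: the reviews, the codebook, and the function name `code_frequencies` are hypothetical, and real coding usually involves human judgment rather than exact word matching.

```python
from collections import Counter
import re

# Hypothetical customer reviews (illustrative data only).
reviews = [
    "Battery life is great, but the camera is average.",
    "Great camera and a bright screen.",
    "The battery drains fast; the screen is fine.",
]

# Hypothetical codebook: the product features we want to track.
codebook = {"battery", "camera", "screen", "price"}

def code_frequencies(texts, codes):
    """Count how often each code word appears across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w in codes)
    return counts

freq = code_frequencies(reviews, codebook)
print(freq.most_common())
```

A code that never appears (here, "price") simply has a count of zero, which is itself a finding worth reporting.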

Q11. Explain the concept of the coefficient of determination and its implications.

Ans.

Concept of the Coefficient of Determination:

The coefficient of determination, also known as R-squared (R²), is an important statistical measure used in regression analysis. It measures the proportion of the variance in a dependent variable (Y) that is explained or predicted by one or more independent variables (X).

Value: R² lies between 0 and 1 (or, as a percentage, between 0% and 100%).
Calculation: In simple linear regression, it is the square of the correlation coefficient r (that is, R² = r²). In multiple regression the calculation is more involved: it is the share of the total variation in Y explained by the fitted model (R² = 1 - SSE/SST).
Interpretation:
- R² = 0 means the independent variable explains none of the variance in the dependent variable. The model has no explanatory power.
- R² = 1 means the independent variable explains all of the variance in the dependent variable. The model fits perfectly, with no error.
- R² = 0.65 means that 65% of the variation in the dependent variable (Y) can be explained by variation in the independent variable(s) (X). The remaining 35% is due to other unknown factors or random error.

In short, R² indicates how well the regression model fits the data (goodness-of-fit). A higher R² means the model accounts for a larger share of the variation in the data, so its fitted values lie closer to the observed values.
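The relation R² = r² for simple linear regression can be checked numerically. The sketch below uses a small made-up dataset (the values of `x` and `y` are assumptions for illustration) and fits the least-squares line by hand.

```python
from math import sqrt

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
sst = sum((b - mean_y) ** 2 for b in y)  # total sum of squares

# Least-squares line: y_hat = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Unexplained (residual) sum of squares
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

r_squared = 1 - sse / sst    # share of variation explained by the line
r = sxy / sqrt(sxx * sst)    # Pearson correlation coefficient

print(round(r_squared, 3), round(r ** 2, 3))
```

For this data both values come out to 0.6, so 60% of the variation in Y is explained by X and the remaining 40% is unexplained.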

Implications of the Coefficient of Determination:

1. Assessing the explanatory power of the model: The main use of R² is to evaluate the explanatory power of a regression model. It helps researchers understand how effective the chosen independent variables are in explaining the dependent variable. A model with a higher R² is generally considered better than one with a lower R².

2. Goodness-of-fit measure: R² serves as a ‘goodness-of-fit’ measure. It indicates how close the observed data points are to the regression line. If R² is high, the data points lie close to the line, which indicates a better fit.

3. Indication of predictive accuracy: Although it does not directly measure predictive accuracy, a high R² suggests that the model is likely to predict better, because it captures most of the variation in the dependent variable.

Important Caveats and Limitations:

A high R² does not always mean a good model: A high R² does not guarantee that the model is good. The model may have other problems, such as omitted variable bias or heteroscedasticity.
R² does not indicate causation: A high R² only shows that there is a strong relationship between the variables; it does not prove that the independent variable causes changes in the dependent variable.
Adjusted R²: When new independent variables are added to a model, R² always increases or stays the same, even if the new variable is insignificant. Adjusted R² is used to overcome this problem. It penalizes the number of variables in the model and gives a more accurate assessment of its explanatory power.
Field-specific interpretation: What counts as a ‘good’ R² depends on the field of study. In the physical sciences, an R² above 0.90 may be common, whereas in the social sciences, where human behavior is involved, an R² of 0.30 may be considered meaningful.
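The penalty applied by adjusted R² can be made concrete with the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of independent variables. The numbers below (R² = 0.65, n = 30, k = 3 or 10) are assumed purely for illustration.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters (n > k + 1)")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Same R-squared, different numbers of predictors (hypothetical values).
adj_small_model = adjusted_r_squared(0.65, n=30, k=3)
adj_large_model = adjusted_r_squared(0.65, n=30, k=10)

print(round(adj_small_model, 4), round(adj_large_model, 4))
```

With the same raw R², the model that uses more predictors receives the lower adjusted value, which is exactly the penalty described above.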

Q12. Write short notes on any three of the following: (a) Confidence interval (b) Central limit theorem (c) Random variable (d) Standard error (e) Types of data

Ans.

(a) Confidence Interval

A confidence interval is a statistical estimate that provides a range of plausible values for an unknown population parameter, such as a mean or a proportion. It is more informative than a point estimate. It is stated with a confidence level, such as 95% or 99%.

For example, a 95% confidence interval means that if we repeatedly draw samples from the same population and compute an interval for each sample, about 95% of those intervals will contain the true population parameter. It gives a measure of the uncertainty in our sample estimate. A narrower interval indicates a more precise estimate, while a wider interval indicates less precision.
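As a minimal sketch, a confidence interval for a mean with known population standard deviation can be computed as sample mean ± z × σ/√n. The inputs below (sample mean 50, σ = 10, n = 100) and the function name `mean_confidence_interval` are assumptions for illustration; when σ is unknown and the sample is small, a t-based interval would be used instead.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, level=0.95):
    """z-based confidence interval for a mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample summary.
low_95, high_95 = mean_confidence_interval(50.0, 10.0, 100, level=0.95)
low_99, high_99 = mean_confidence_interval(50.0, 10.0, 100, level=0.99)

print(round(low_95, 2), round(high_95, 2))
print(round(low_99, 2), round(high_99, 2))
```

The 95% interval is roughly 48.04 to 51.96; the 99% interval is wider, reflecting the trade-off between confidence and precision.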

(b) Central Limit Theorem

The Central Limit Theorem (CLT) is one of the most important theorems in statistics. It states that if we draw samples of sufficiently large size (as a rule of thumb, n > 30) from a population with any distribution that has a finite variance, the distribution of the sample means (the sampling distribution of the mean) will be approximately normal, even if the population itself is not normally distributed.

According to the theorem, the mean of the sampling distribution of the sample mean equals the population mean (μ), and its standard deviation (called the standard error) is σ/√n, where σ is the population standard deviation and n is the sample size. The CLT provides the basis for hypothesis testing and the construction of confidence intervals.
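A small simulation can illustrate the two claims about the sampling distribution. The setup is an assumption chosen for illustration: a strongly skewed exponential population with μ = 1 and σ = 1, samples of size n = 40, and 5,000 repetitions.

```python
import random
from math import sqrt
from statistics import stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 40        # size of each sample
reps = 5000   # number of samples drawn

# Exponential population with rate 1: mean 1, standard deviation 1, right-skewed.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

mean_of_means = sum(sample_means) / reps
sd_of_means = stdev(sample_means)
theoretical_se = 1.0 / sqrt(n)  # sigma / sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(theoretical_se, 3))
```

The average of the sample means should land close to μ = 1, and their standard deviation close to σ/√n ≈ 0.158, even though the population itself is far from normal.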

(c) Random Variable

A random variable is a variable whose value is the numerical outcome of a random phenomenon. Its value cannot be known with certainty before the phenomenon occurs. Random variables are classified into two main types:

1. Discrete Random Variable: It can take only a finite or countable number of values. Example: the number obtained when a die is rolled (1, 2, 3, 4, 5, 6), or the number of heads when a coin is tossed 3 times (0, 1, 2, 3).

2. Continuous Random Variable: It can take any value within a given interval. Example: a person’s height, the temperature of a city, or the distance a car travels on one liter of petrol. The set of possible values is uncountably infinite.
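The coin example for a discrete random variable can be worked out by enumerating all outcomes. The sketch below assumes a fair coin, so each of the 8 sequences of three tosses is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 2^3 equally likely sequences of three tosses of a fair coin.
outcomes = list(product("HT", repeat=3))

# Probability distribution of X = number of heads.
pmf = {}
for outcome in outcomes:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, 0) + Fraction(1, len(outcomes))

expected_heads = sum(k * p for k, p in pmf.items())

for k in sorted(pmf):
    print(k, pmf[k])
print("E[X] =", expected_heads)
```

X takes the values 0, 1, 2, 3 with probabilities 1/8, 3/8, 3/8, 1/8, and its expected value is 3/2.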

(d) Standard Error

The standard error (SE) is the standard deviation of the sampling distribution of a sample statistic, such as the sample mean. It indicates how far, on average, the sample statistic is expected to lie from the true value of the population parameter.

In simple terms, it is a measure of the precision of an estimate obtained from a sample. A smaller standard error indicates that the sample mean is a more precise estimate of the population mean. The standard error of the sample mean is SE = σ/√n, where σ is the population standard deviation and n is the sample size. As the sample size (n) increases, the standard error decreases, so the estimate becomes more precise.

(e) Types of Data

Data can be classified in various ways according to its nature and characteristics. The main classification is as follows:

1. Qualitative or Categorical Data: It describes non-numerical characteristics.

Nominal: This data cannot be ordered. It only represents categories or labels. Example: gender (male, female), blood group (A, B, AB, O), color.
Ordinal: This data can be arranged in a logical order, but the differences between values are not meaningful. Example: satisfaction level (very dissatisfied, dissatisfied, satisfied), grades (A, B, C).

2. Quantitative or Numerical Data: It represents countable or measurable quantities.

Discrete: It can take only specific, separate values and can be counted. Example: the number of students in a class, the number of children in a family.
Continuous: It can take any value within a given range and is measured. Example: height, weight, temperature.

IGNOU BECS-184 Previous Year Solved Question Paper in English

Q1. Explain various measures of central tendency. What is the relationship among mean, median and mode? Point out situations where median and mode are suitable.

Ans. Measures of central tendency are statistical values that represent the center or a typical value of a dataset. These measures summarize a whole set of data with a single value that represents the middle or center of its distribution. The main measures are:

1. Mean: Also known as the ‘average’, the mean is calculated by summing all the values in a dataset and dividing by the total number of values. Formula: Mean (μ or x̄) = (Sum of all values) / (Total number of values) = Σx / n. It is the most commonly used measure but is highly sensitive to extreme values (outliers).

2. Median: The median is the middle value in a dataset that has been arranged in ascending or descending order. It divides the data into two equal halves.

If the number of values (n) is odd, the median is the ((n+1)/2)th value.
If the number of values (n) is even, the median is the average of the two middle values.

The median is not affected by outliers, making it a better measure for skewed data.

3. Mode: The mode is the value that appears most frequently in a dataset. A dataset can have more than one mode (bimodal, multimodal) or no mode at all. It is primarily used for categorical data.
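The three measures can be computed with Python’s standard `statistics` module. The income figures below are hypothetical and include one deliberate outlier to show how differently the mean and the median react to it.

```python
from statistics import mean, median, multimode

# Hypothetical incomes (in thousands); 200 is a deliberate outlier.
incomes = [20, 22, 25, 25, 28, 30, 200]

avg = mean(incomes)         # pulled upward by the outlier
mid = median(incomes)       # middle value of the sorted data
modes = multimode(incomes)  # list of the most frequent value(s)

print(avg, mid, modes)

# Dropping the outlier changes the mean a lot but the median not at all.
trimmed = incomes[:-1]
print(mean(trimmed), median(trimmed))
```

With the outlier the mean is 50 while the median and mode are both 25; without it the mean falls to 25 and the median stays at 25.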

Relationship among Mean, Median, and Mode:

For a moderately skewed distribution, there is an empirical relationship between these three measures, expressed as:

Mode ≈ 3(Median) – 2(Mean)

Mean – Mode ≈ 3(Mean – Median)

In a symmetrical (unimodal) distribution, Mean = Median = Mode.
In a positively skewed distribution (skewed to the right), Mean > Median > Mode.
In a negatively skewed distribution (skewed to the left), Mean < Median < Mode.
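As a worked illustration of the empirical relationship, suppose (hypothetical figures) a moderately skewed distribution has Mean = 30 and Median = 28. The mode can then be estimated as:

```latex
\text{Mode} \approx 3(\text{Median}) - 2(\text{Mean}) = 3(28) - 2(30) = 84 - 60 = 24
```

The equivalent form gives the same answer: Mean – Mode = 30 – 24 = 6 and 3(Mean – Median) = 3(2) = 6. Since Mean (30) > Median (28) > Mode (24), the distribution is positively skewed.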

Situations where Median and Mode are suitable:

The Median is suitable when:

The dataset contains extreme values or outliers, such as income distribution or wealth data. The median is not distorted by these extremes and gives a better representation of the center.
The data distribution is skewed.
The data has open-ended classes, such as “over 100”.

The Mode is suitable when:

The data is qualitative or categorical, for instance, finding the most popular car color, a favorite brand, or the candidate with the most votes in an election survey.
In business, it is important to know which product size sells the most (e.g., shoe or shirt size).
The objective is to find the most common or typical value in a distribution.

Q2. (a) Give a brief account of the methods of primary data collection. (b) Explain the steps involved in planning and organization of a sample survey.

Ans. (a) Methods of Primary Data Collection:

Primary data is the data collected for the first time by a researcher specifically for their research purpose. It is original and fresh. The main methods for its collection are:

1. Direct Personal Interview: The investigator contacts the respondents directly and asks questions face-to-face. This method provides reliable and detailed information but is expensive and time-consuming.

2. Indirect Oral Investigation: This method is used when direct contact with respondents is not possible. Information is collected from persons or witnesses who are expected to possess relevant information about the subject.

3. Questionnaire Method: A list of questions (a questionnaire) is prepared and sent to the respondents to be filled out. This can be done in two ways:

Mailed Questionnaire: Questionnaires are sent by post. This is cheap for covering a large area, but the response rate is often low.
Questionnaire filled by Enumerators: Trained enumerators visit the respondents and fill the questionnaires themselves. This yields high-quality data but is expensive.

4. Observation Method: In this method, the researcher directly observes and records the behavior of individuals or events without any interaction. It is useful for behavioral studies but can be subjective.

5. Telephonic Interview: The researcher asks questions to the respondents over the telephone. It is quick and less expensive but is limited to those who have a telephone and lacks non-verbal cues.

(b) Steps in Planning and Organization of a Sample Survey:

A sample survey is a systematic process of drawing conclusions about a whole population by studying a small part (a sample) of it. The planning and organization involve the following steps:

1. Defining the Objectives: The first and most crucial step is to clearly define the objectives of the survey. It is essential to know what information is to be collected and why.

2. Defining the Population: Clearly define the group to be studied. For instance, if you are studying the spending habits of students, the population could be all students of a particular university.

3. Determining the Sampling Frame: This is a list of all units in the population from which the sample will be selected, such as a voter list or a telephone directory. A good frame should be complete and up-to-date.

4. Selection of Sampling Method: Decide the method for selecting the sample. It could be probability sampling (e.g., simple random, stratified) or non-probability sampling (e.g., convenience, quota). The choice of method depends on accuracy, cost, and time.

5. Determining the Sample Size: Decide how many people will be included in the survey. The sample size must be large enough for the results to be reliable, but not so large that the cost and time become prohibitive.

6. Designing and Pre-testing the Questionnaire: Prepare an effective questionnaire based on the survey’s objectives. Questions should be clear, unbiased, and easy to understand. It is essential to pre-test the questionnaire on a small group to identify any problems before the main survey.

7. Training and Organizing Field Staff: If enumerators are being used, it is necessary to give them proper training so they can collect data in a standardized manner.

8. Collection of Data: Collect the data from the sample using the chosen method.

9. Processing and Analysis of Data: The collected data is checked, coded, tabulated, and then analyzed using statistical techniques.

10. Report Writing: Finally, present the findings in a detailed report, which includes the methodology, key findings, and recommendations.
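Steps 3 to 5 above (frame, sampling method, sample size) can be sketched in code. Everything specific here is an assumption for illustration: a frame of 5,000 hypothetical student IDs, simple random sampling, and the common sample-size formula for estimating a proportion, n = z²p(1 - p)/e², with p = 0.5 and a 5% margin of error. The formula ignores the finite population correction.

```python
import random
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, level=0.95, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Step 3: a hypothetical sampling frame (list of all units in the population).
frame = [f"student-{i:04d}" for i in range(1, 5001)]

# Step 5: sample size for a 5% margin of error at 95% confidence.
n = sample_size_for_proportion(0.05)

# Step 4: simple random sampling without replacement from the frame.
random.seed(1)
sample = random.sample(frame, n)

print(n, sample[:3])
```

Using p = 0.5 is the conservative choice because it maximizes p(1 - p) and therefore the required sample size.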

Q3. Distinguish between null hypothesis and alternative hypothesis. Explain the steps you would follow in testing of a hypothesis.

Ans. Distinction between Null and Alternative Hypothesis:

In hypothesis testing, two competing hypotheses are formulated: the null hypothesis and the alternative hypothesis.

Null Hypothesis (H₀):

It is a statement of no relationship between variables, no difference between groups, or a claim of the status quo.
It is the hypothesis that the researcher attempts to disprove or reject based on statistical evidence.
It is usually stated as an equality, such as μ = 100 or p₁ = p₂. In a one-tailed test it may be written with ≤ or ≥, but it always includes the equality case.
Example: “A new drug has no effect on blood pressure.”

Alternative Hypothesis (H₁ or Hₐ):

It is a statement that contradicts the null hypothesis. It is the claim that the researcher wants to prove.
It suggests a relationship between variables, a difference between groups, or the presence of an effect.
It is expressed as an inequality (≠), or a directional claim (< or >).
Example: “A new drug lowers blood pressure.” (μ < μ₀) or “A new drug has an effect on blood pressure.” (μ ≠ μ₀)

In essence, H₀ is the ‘no effect’ position, while H₁ is the ‘effect’ the researcher is looking for. The goal of hypothesis testing is to determine if the sample data provides enough evidence to reject H₀ in favor of H₁.

Steps in Testing of a Hypothesis:

The process of testing a hypothesis involves the following systematic steps:

1. State the Null and Alternative Hypotheses (H₀ and H₁): First, clearly define the null (H₀) and alternative (H₁) hypotheses based on the research question. The alternative hypothesis can be one-tailed or two-tailed.

2. Choose the Significance Level (α): The significance level (α) represents the probability of rejecting the null hypothesis when it is actually true (a Type I error). Common values for α are 0.05 (5%), 0.01 (1%), or 0.10 (10%).

3. Select the Appropriate Test Statistic: Choose the correct statistical test based on the nature of the data, sample size, and the hypothesis. For example, z-test, t-test, chi-square (χ²) test, or F-test.

4. Formulate the Decision Rule: Determine the ‘rejection region’ or ‘critical value’ based on the theoretical distribution of the test statistic. If the calculated test statistic falls into this region, H₀ will be rejected.

5. Compute the Test Statistic: Calculate the value of the chosen test statistic using the sample data.

6. Make a Decision: Compare the computed statistic to the critical value (or compare the p-value to α).

If the computed value falls in the rejection region (or if p-value ≤ α), then reject the null hypothesis.
If the computed value does not fall in the rejection region (or if p-value > α), then fail to reject the null hypothesis.

7. Interpret the Conclusion: Finally, interpret the statistical decision in the context of the research question. State what your findings mean in practical terms.
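The seven steps can be traced through a small two-tailed z-test. All the numbers are hypothetical and chosen for illustration: H₀: μ = 100 against H₁: μ ≠ 100, known σ = 15, a sample of n = 36 with mean 106, and α = 0.05.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: hypotheses and significance level (hypothetical values).
mu0 = 100.0          # H0: mu = 100, H1: mu != 100
alpha = 0.05

# Sample information (assumed): known sigma, sample size, sample mean.
sigma, n, sample_mean = 15.0, 36, 106.0

# Steps 3-4: z-test; reject H0 if |z| exceeds the two-tailed critical value.
std_normal = NormalDist()
critical = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96

# Step 5: compute the test statistic.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Step 6: decision by critical value and, equivalently, by p-value.
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_h0 = abs(z) > critical

# Step 7: interpretation.
print(f"z = {z:.2f}, critical = {critical:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Here z = 2.40 exceeds 1.96 and the p-value is about 0.016, so both decision rules lead to rejecting H₀ at the 5% level.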

Q4. What is a composite index? Explain how a composite index is constructed.

Ans. Composite Index:

A composite index is a statistical measure that combines a group of related variables or indicators into a single numerical value to measure changes over time, geographical location, or other characteristics. It simplifies a complex concept and makes it easy to monitor by aggregating multiple individual indicators into one score. The purpose is to present an at-a-glance summary of the overall trend or change in several variables. Examples include:

Consumer Price Index (CPI): Measures changes in the cost of living by combining the prices of various goods and services (e.g., food, clothing, housing).
Human Development Index (HDI): Measures a country’s social and economic development level by combining indicators like life expectancy, education, and per capita income.
Stock Market Indices (e.g., Sensex, Nifty): Reflect the overall performance of the market by combining the stock values of several major companies.

Steps in Constructing a Composite Index:

Constructing a composite index is a complex process involving several decisions. The main steps are as follows:

1. Define the Purpose and Scope: First, it is crucial to clearly define what the index is intended to measure. For example, will it measure the price level, industrial production, or social welfare? Its scope (geographical, sectoral) must also be determined.

2. Selection of Items: Carefully select the items or indicators to be included in the index. These items must be representative of the concept being measured. For CPI, for example, goods and services consumed by a typical family are selected.

3. Selection of a Base Period: A base period must be chosen for comparison. This period should be normal and recent, without any major abnormalities (like war or famine). The index value for the base period is conventionally set to 100.

4. Collection of Data: Collect data for the selected items, such as prices, quantities, or other values. This data should be obtained from reliable sources.

5. Assignment of Weights: Each item in the index is assigned a weight according to its relative importance. For example, in the CPI, expenditure on food is given more weight than expenditure on clothing because a family spends a larger portion of its income on food. Weights can be based on quantity or value.

6. Selection of an Appropriate Formula: Choose a suitable average or formula for calculating the index. Common formulas for weighted indices are:

Laspeyres’ Index: Uses base period quantities as weights.
Paasche’s Index: Uses current period quantities as weights.
Fisher’s Ideal Index: It is the geometric mean of Laspeyres’ and Paasche’s indices and is considered statistically superior.

7. Calculation of the Index: Using the chosen formula and weights, calculate the final index value. This value represents the overall change relative to the base period.
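The three weighted formulas named in step 6 can be compared on a small basket. The items, prices, and quantities below are hypothetical; the formulas are the standard price-index definitions, Laspeyres = Σp₁q₀/Σp₀q₀ × 100 and Paasche = Σp₁q₁/Σp₀q₁ × 100, with Fisher as their geometric mean.

```python
from math import sqrt

# Hypothetical basket: item -> (base price p0, base qty q0, current price p1, current qty q1)
basket = {
    "rice":  (20, 10, 25, 8),
    "cloth": (50, 2, 60, 3),
    "fuel":  (10, 5, 15, 4),
}

def laspeyres(items):
    """Price index weighted by base-period quantities."""
    num = sum(p1 * q0 for p0, q0, p1, q1 in items.values())
    den = sum(p0 * q0 for p0, q0, p1, q1 in items.values())
    return 100 * num / den

def paasche(items):
    """Price index weighted by current-period quantities."""
    num = sum(p1 * q1 for p0, q0, p1, q1 in items.values())
    den = sum(p0 * q1 for p0, q0, p1, q1 in items.values())
    return 100 * num / den

def fisher(items):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres(items) * paasche(items))

L, P, F = laspeyres(basket), paasche(basket), fisher(basket)
print(round(L, 2), round(P, 2), round(F, 2))
```

For this basket the Laspeyres index is about 127.14 and the Paasche index about 125.71, with Fisher’s index lying between the two, so prices rose by roughly 26 to 27 percent relative to the base period (index = 100).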

Q5. Explain the difference between single bar diagram and histogram. Construct a histogram from the following data: Scores in numerical ability / Frequency 0-20 / 5 20-30 / 30 30-40 / 50 40-50 / 35 50-60 / 20

Ans. Difference between a Simple Bar Diagram and a Histogram: Both simple bar diagrams and histograms are graphical methods of representing data, but they are used for different types of data and have significant differences in their construction.

Feature	Simple Bar Diagram	Histogram
Type of Data	Used for discrete or categorical data. E.g., population of different countries, production in different years.	Used for continuous data that is grouped into class intervals.
Space between Bars	There are equal gaps between the bars. Each bar represents a distinct category.	The bars are adjacent to each other with no gaps, as it represents a continuous series.
Width of Bars	The width of the bars has no special significance, but they should all be of equal width.	The width of the bars represents the width of the class interval. It can be unequal.
X-axis	The X-axis displays discrete categories or discrete values.	The X-axis is a continuous number line representing the class intervals.
Area vs. Height	The height (or length) of the bar is proportional to the frequency.	The area of the bar is proportional to the frequency. If class intervals are equal, the height is proportional to the frequency.

Construction of a Histogram:

The given data is continuous and grouped into class intervals, so a histogram can be constructed.

(Note: The question paper text uses “प्रायिकता” which means “Probability” for the second column in Hindi. This is likely a typo for “आवृत्ति” which means “Frequency”. We will proceed with the English term “Frequency”.)

Data:

Scores in numerical ability (Class Interval): 0-20, 20-30, 30-40, 40-50, 50-60
Frequency: 5, 30, 50, 35, 20

Steps to construct the histogram:

1. Draw two axes on a graph paper: a horizontal axis (X-axis) and a vertical axis (Y-axis).

2. On the X-axis, mark the ‘Scores in numerical ability’ using a suitable scale. Here, we will mark 0, 10, 20, 30, 40, 50, 60 at equal distances. This axis will represent the class intervals.

3. On the Y-axis, mark the ‘Frequency’ using a suitable scale. Since the maximum frequency is 50, we can mark values from 0 to 55 at intervals of 5 or 10 (e.g., 0, 10, 20, 30, 40, 50).

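4. Now, draw a rectangular bar for each class interval. There will be no gaps between the bars as the data is continuous. Because the first class (0-20) is twice as wide as the others, its height must be adjusted so that the area of each bar stays proportional to its frequency.

0-20: Draw a bar with width from 0 to 20 on the X-axis and height 5 × (10/20) = 2.5 on the Y-axis (frequency per 10-mark interval), not 5.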
20-30: Draw a bar with width from 20 to 30 on the X-axis and a height of 30. This will be adjacent to the first bar.
30-40: Draw a bar with width from 30 to 40 on the X-axis and a height of 50.
40-50: Draw a bar with width from 40 to 50 on the X-axis and a height of 35.
50-60: Draw a bar with width from 50 to 60 on the X-axis and a height of 20.

The resulting graph will be the histogram for the given frequency distribution. The graph can be titled “Histogram of Scores in Numerical Ability”.
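Because the classes in this question are not all the same width, the bar heights are worth checking with a quick calculation. The sketch below rescales every frequency to a common 10-mark width, so that bar area stays proportional to frequency; the choice of 10 as the standard width is an assumption based on the most common class width in the data.

```python
# Class intervals and frequencies from the question.
classes = [(0, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [5, 30, 50, 35, 20]

standard_width = 10  # most classes are 10 marks wide

# Height = frequency rescaled to the standard class width.
heights = [
    f * standard_width / (upper - lower)
    for (lower, upper), f in zip(classes, frequencies)
]

# Check: area (height x width, in standard-width units) recovers each frequency.
areas = [
    h * (upper - lower) / standard_width
    for (lower, upper), h in zip(classes, heights)
]

for (lower, upper), h in zip(classes, heights):
    print(f"{lower}-{upper}: height {h}")
```

Only the 0-20 class changes: its bar is drawn with height 2.5 rather than 5, while the four 10-mark classes keep heights equal to their frequencies.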

Q6. Explain how p-value method is used in hypothesis testing.

Ans. The p-value is a crucial concept in hypothesis testing. It is a probability value that measures the likelihood of obtaining the observed sample results, or results more extreme, assuming the null hypothesis (H₀) is actually true. In simple terms, it’s a measure of the inconsistency between your data and the null hypothesis.

Use of the p-value Method in Hypothesis Testing:

The p-value method is an alternative to the traditional critical value method and is widely used in modern statistical software. The steps for using this method are as follows:

1. State the Hypotheses: First, clearly define the null hypothesis (H₀) and the alternative hypothesis (H₁). H₀ represents the ‘no effect’ position, while H₁ represents the effect the researcher wants to prove.

2. Choose a Significance Level (α): Select a significance level (alpha), which is the maximum acceptable probability of committing a Type I error (rejecting a true H₀). Common values are 0.05, 0.01, or 0.10. This acts as a threshold for making a decision.

3. Calculate the Test Statistic: Based on your sample data, compute the appropriate test statistic (e.g., t-statistic, z-statistic, chi-square statistic). This value measures how far your sample result is from the expected result under H₀.

4. Calculate the p-value: Based on the calculated test statistic, find the p-value. The p-value is the area (or probability) in the tail(s) of the test statistic’s distribution beyond your calculated value.

One-tailed test: The p-value is the area in one tail of the distribution.
Two-tailed test: The p-value is the total area in both tails of the distribution.

5. Apply the Decision Rule: Compare the p-value with the significance level (α).

If p-value ≤ α: The observed result would be rare if H₀ were true, so we reject the null hypothesis. The result is considered “statistically significant” and counts as evidence in favor of the alternative hypothesis.
If p-value > α: The observed result could plausibly have occurred by chance if H₀ were true, so we fail to reject the null hypothesis. The result is considered “not statistically significant”.

Example:

Suppose a researcher wants to test if a new teaching method improves student scores. H₀ would be “no improvement” and H₁ would be “there is improvement”. They choose α = 0.05. After analysis, they find a p-value = 0.023.

Since the p-value (0.023) ≤ α (0.05), the researcher would reject the null hypothesis. They would conclude that there is sufficient statistical evidence that the new teaching method is effective.

An advantage of the p-value method is that it not only tells you if the result is significant but also indicates the ‘strength’ of the significance. A very small p-value (e.g., 0.001) indicates very strong evidence against H₀.
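Step 4 of the method, turning a test statistic into a p-value, can be sketched for a z-statistic. The helper name `p_value_from_z` and the value z = 2.0 are assumptions for illustration; the teaching-method example above reports only the p-value, not the statistic behind it.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """Tail area of the standard normal distribution beyond z."""
    cdf = NormalDist().cdf
    if tail == "right":   # H1: parameter is greater than the H0 value
        return 1 - cdf(z)
    if tail == "left":    # H1: parameter is less than the H0 value
        return cdf(z)
    if tail == "two":     # H1: parameter differs from the H0 value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

alpha = 0.05
z = 2.0  # hypothetical test statistic

p_right = p_value_from_z(z, "right")
p_two = p_value_from_z(z, "two")

print(round(p_right, 4), p_right <= alpha)
print(round(p_two, 4), p_two <= alpha)
```

A right-tailed test with z = 2.0 gives a p-value of about 0.023, while the two-tailed p-value is twice as large (about 0.046); both fall below α = 0.05 in this case.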

Q7. Distinguish between correlation and causation. Explain how cause and effect between two variables (X = cause and Y = effect) can be analysed.

Ans. Distinction between Correlation and Causation:

Correlation:

Correlation refers to a statistical relationship between two variables. It measures the extent to which two variables change together.
It indicates the direction (positive or negative) and strength (strong or weak) of the relationship.
Positive Correlation: As one variable increases, the other also tends to increase (e.g., exercise and calories burned).
Negative Correlation: As one variable increases, the other tends to decrease (e.g., temperature and sales of winter clothes).
The correlation coefficient (r) ranges from -1 to +1, where +1 is a perfect positive correlation, -1 is a perfect negative correlation, and 0 indicates no linear correlation.
Key Point: Correlation does not imply causation. Just because two variables move together does not prove that one causes the other.
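The correlation coefficient described above can be computed directly from its definition, r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² Σ(y - ȳ)²). The two small datasets are hypothetical and chosen to show a moderate positive and a perfect negative relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]       # tends to rise with hours
stock = [10, 8, 6, 4, 2]      # falls by a fixed amount each step

r_positive = pearson_r(hours, score)
r_negative = pearson_r(hours, stock)

print(round(r_positive, 4), round(r_negative, 4))
```

The first pair gives r of about 0.77 (a fairly strong positive correlation) and the second gives exactly -1, but neither number says anything about which variable, if any, causes the other.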

Causation:

Causation, or causality, refers to a situation where one event (the effect) is the result of the occurrence of another event (the cause).
Here, a change in one variable directly causes a change in another variable.
Example: Smoking (cause) leads to lung cancer (effect). Raining (cause) makes the ground wet (effect).
Establishing causation is much more difficult than establishing correlation.

Example:

There is a strong positive correlation between ice cream sales and the number of drowning incidents in summer. But this does not mean eating ice cream causes drowning. There is a third variable, hot weather (a confounding variable), that causes both. In hot weather, people eat more ice cream and also go swimming more, which increases drowning incidents.

Analysis of Cause and Effect (X = cause and Y = effect):

To establish that variable X causes variable Y, correlation alone is not enough. The following criteria must be met to analyze and establish a causal relationship:

1. Establish Correlation: The first step is to show that a statistically significant correlation exists between X and Y. If there is no relationship, there can be no causation. This can be analyzed through correlation coefficients or regression analysis.

2. Temporal Precedence: The cause must precede the effect in time. This means the change in variable X must occur before the change in variable Y. Longitudinal studies, where variables are measured over time, can help establish this criterion.

3. Rule out Alternative Explanations: This is the most challenging step. We must ensure that the observed relationship between X and Y is not due to a third, confounding variable. There are two main approaches for this:

Controlled Experiments: This is the ‘gold standard’ for establishing causation. In an experiment, the researcher deliberately manipulates variable X (the independent variable) while holding all other potential variables constant, and then measures its effect on variable Y (the dependent variable). Random assignment helps ensure that the treatment and control groups are initially similar.
Advanced Statistical Control: When experiments are not feasible (e.g., in economics or sociology), statistical techniques like multiple regression analysis are used. In this method, the researcher mathematically ‘controls’ for the effects of other potential confounding variables to assess the pure relationship between X and Y.

4. A Logical Mechanism: There should be a plausible theory or mechanism that explains how X causes Y. This theory adds credibility to the analysis.
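The idea of statistically controlling for a confounder can be illustrated with the ice cream example. The sketch below is a simulation under stated assumptions, not real data: temperature drives both variables and neither affects the other. It uses a first-order partial correlation, a simpler relative of the multiple regression mentioned above, to remove the effect of temperature.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation of x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

random.seed(42)

# Simulated world (assumption): temperature causes both outcomes independently.
temperature = [random.uniform(20, 40) for _ in range(2000)]
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + random.gauss(0, 2) for t in temperature]

raw = pearson_r(ice_cream, drownings)
controlled = partial_r(ice_cream, drownings, temperature)

print(round(raw, 2), round(controlled, 2))
```

The raw correlation between ice cream sales and drownings is strong, but once temperature is held constant the partial correlation is close to zero, which is what one expects when the relationship is entirely due to the confounder.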

Q8. Give a brief account of the sources and the limitations of secondary data.

Ans. Secondary Data:

Secondary data is data that has already been collected and processed by someone else or an organization for some other purpose. The researcher uses this data to answer their own research question. This is in contrast to primary data, which the researcher collects themselves. Secondary data can be easily accessible, inexpensive, and time-saving.

Sources of Secondary Data:

There are numerous sources of secondary data, which can be broadly classified into published and unpublished sources.

A. Published Sources:

Government Publications: Central and state governments collect and publish various types of data.
- Census of India: Detailed information on population, literacy, sex ratio, etc.
- National Sample Survey Organisation (NSSO): Surveys on employment, unemployment, consumption expenditure, health, etc.
- Reserve Bank of India (RBI): Bulletins and reports on banking, finance, inflation, and other economic indicators.
- Central Statistical Office (CSO): Data on national income, industrial production, etc.
International Publications:
- World Bank: Data on development indicators for various countries.
- International Monetary Fund (IMF): International financial statistics.
- United Nations Organisation (UNO): Global data on demographic, social, and economic subjects.
Trade, Industry, and Financial Publications: Various industry associations (like FICCI, CII) and stock exchanges publish data on their respective sectors.
Reports of Research Institutions and Universities: Academic institutions publish findings from research conducted on various subjects.
Journals, Newspapers, and Websites: These sources provide data on current events and various topics.

B. Unpublished Sources:

This includes data that is collected but not published. It can be found in the internal records of government departments, private firms, trade unions, and other organizations. Access to these might be difficult.

Limitations of Secondary Data:

Unsuitability: Since the data was collected for another purpose, it may not fit the needs of the current research. The definitions, units, or coverage might be different.
Outdated Data: The data may be old and may not reflect the current situation.
Lack of Reliability: The reliability of the source that collected the data may be questionable. It can be difficult to know the method, bias, and level of accuracy with which the data was collected.
Insufficient Information: The data may lack the detail or classification required for the current research. For example, you may need separate rural and urban income data when the available data shows only total income.
Bias: The organization that collected the data may have had its own bias, which could affect the impartiality of the data.

Therefore, before using secondary data, a researcher must carefully evaluate its suitability, reliability, adequacy, and accuracy.

Q9. Briefly describe various techniques of analyzing multivariate data.

Ans. Multivariate data refers to data where more than two variables are measured for each observation. Multivariate analysis techniques aim to understand the complex relationships among these variables simultaneously. These techniques give a more realistic picture than univariate and bivariate analysis. The main techniques for analyzing multivariate data are as follows:

1. Multiple Regression Analysis:

This is one of the most widely used multivariate techniques.
It is used to study the combined effect of two or more independent variables on a single dependent variable and to predict the value of the dependent variable.
Example: Understanding a person’s income (dependent variable) based on their education, experience, age, and gender (independent variables).
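
As a rough illustration of the income example, the sketch below fits a multiple regression by ordinary least squares with NumPy. The data are hypothetical and deliberately noise-free (income is built as 10 + 2 × education + 1.5 × experience), so the fitted coefficients should recover those values; real data would not fit this neatly.

```python
import numpy as np

# Hypothetical, noise-free data: income = 10 + 2*education + 1.5*experience
education = np.array([10, 12, 12, 14, 16, 16, 18, 20], dtype=float)
experience = np.array([1, 3, 8, 5, 2, 10, 7, 4], dtype=float)
income = 10 + 2.0 * education + 1.5 * experience

# Design matrix: a column of ones (intercept) plus the two independent variables
X = np.column_stack([np.ones_like(education), education, experience])

# Ordinary least squares: choose coefficients that minimise the squared errors
coef, *_ = np.linalg.lstsq(X, income, rcond=None)
intercept, b_education, b_experience = coef

# Predict income for a person with 15 years of education and 6 of experience
predicted = intercept + b_education * 15 + b_experience * 6
print(coef, predicted)
```

Each slope is read as the change in the dependent variable for a one-unit change in that independent variable, holding the other constant.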

2. Factor Analysis:

Its purpose is to reduce a large number of interrelated variables into a smaller number of underlying, unobservable factors.
It helps in simplifying the data by identifying patterns and structure within the data.
Example: Identifying an underlying factor called “customer satisfaction” from several survey questions (like quality, price, service).
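
The idea can be sketched on simulated survey data. Note the assumptions: the three items and their strengths are invented, and the code uses principal-component extraction from the correlation matrix as a simple stand-in for a full factor analysis (which would normally also estimate unique variances and possibly rotate the factors).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical latent factor; in a real survey it is never observed directly
satisfaction = rng.normal(size=n)

# Three observed survey items, each driven by the factor plus its own noise
quality = 0.90 * satisfaction + 0.40 * rng.normal(size=n)
price = 0.80 * satisfaction + 0.50 * rng.normal(size=n)
service = 0.85 * satisfaction + 0.45 * rng.normal(size=n)
items = np.column_stack([quality, price, service])

# Correlation matrix of the observed items
R = np.corrcoef(items, rowvar=False)

# Eigen-decomposition; eigh returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings of each item on the first extracted factor (sign is arbitrary)
loadings = eigvecs[:, 0] * np.sqrt(eigvals[0])

# Share of total variance captured by that single factor
share = eigvals[0] / eigvals.sum()
print(loadings, share)
```

When one factor carries most of the variance and every item loads heavily on it, the three questions can reasonably be summarised by a single underlying dimension.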

3. Cluster Analysis:

This technique is used to classify observations or objects into groups (clusters) based on the similarity of their properties or characteristics.
The goal is that objects within the same group are more similar to each other than to objects in different groups.
Example: Dividing customers into different segments for market segmentation based on their purchasing habits and demographics.
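
One common clustering method is k-means. The sketch below is a minimal version of it, assuming six hypothetical customers described by monthly spend and number of visits, and using two of them as starting centres so the result is reproducible.

```python
import numpy as np

def k_means(points, centers, n_iter=20):
    """Plain k-means: assign each point to its nearest centre, then move each centre."""
    centers = centers.astype(float).copy()
    for _ in range(n_iter):
        # Distance from every point to every centre, shape (n_points, k)
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(len(centers)):
            if np.any(labels == j):  # leave a centre alone if its cluster is empty
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

# Hypothetical customers: [monthly spend, visits per month]
customers = np.array(
    [[200, 2], [220, 3], [210, 2], [900, 12], [950, 15], [880, 11]], dtype=float
)

labels, centers = k_means(customers, centers=customers[[0, 3]])
print(labels)
print(centers)
```

The variables here are left unscaled, so spend dominates the distance; in practice variables are usually standardised before clustering.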

4. Discriminant Analysis:

It is used to classify or predict an observation into one of several pre-defined groups based on a set of independent variables.
It is similar to regression, but here the dependent variable is categorical.
Example: Predicting whether a loan applicant will be a ‘good credit risk’ or a ‘bad credit risk’ based on financial variables.
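
For two groups, the classic approach is Fisher's linear discriminant. The following sketch assumes eight hypothetical loan applicants, two financial variables, and equal prior probabilities for the two groups; the variable names and figures are illustrative only.

```python
import numpy as np

# Hypothetical applicants: [annual income, debt-to-income ratio]
good = np.array([[8.0, 0.20], [9.5, 0.25], [7.5, 0.15], [10.0, 0.30]])
bad = np.array([[3.0, 0.60], [4.0, 0.70], [3.5, 0.55], [5.0, 0.65]])

mean_good, mean_bad = good.mean(axis=0), bad.mean(axis=0)

# Pooled within-group scatter matrix
S_w = (good - mean_good).T @ (good - mean_good) + (bad - mean_bad).T @ (bad - mean_bad)

# Discriminant weights: the direction that best separates the two group means
w = np.linalg.solve(S_w, mean_good - mean_bad)

# Cut-off halfway between the projected group means (assumes equal priors)
cutoff = w @ (mean_good + mean_bad) / 2

def classify(applicant):
    """Return the predicted group for one applicant."""
    score = w @ np.asarray(applicant, dtype=float)
    return "good" if score > cutoff else "bad"

print(classify([9.0, 0.20]), classify([3.2, 0.68]))
```

The discriminant score plays the same role as the fitted value in regression, except that it is compared with a cut-off to assign a category.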

5. Multivariate Analysis of Variance (MANOVA):

This is an extension of ANOVA. It is used to test whether there are statistically significant differences in the means of more than one dependent variable between two or more groups.
It analyzes the effects of groups on multiple dependent variables at the same time.
Example: Examining whether three different teaching methods have an effect on students’ scores in both math and science.
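
A commonly used MANOVA test statistic is Wilks' lambda, the ratio of the within-group variation to the total variation across all dependent variables taken together. The sketch below computes it for hypothetical scores under three teaching methods; the significance test (usually an F approximation) is not shown.

```python
import numpy as np

# Hypothetical scores: each row is one student, columns are [math, science]
groups = {
    "method_A": np.array([[60, 62], [65, 66], [58, 61], [63, 64]], dtype=float),
    "method_B": np.array([[70, 68], [74, 73], [69, 70], [72, 71]], dtype=float),
    "method_C": np.array([[80, 79], [85, 83], [78, 80], [82, 81]], dtype=float),
}

all_scores = np.vstack(list(groups.values()))
grand_mean = all_scores.mean(axis=0)

W = np.zeros((2, 2))  # within-group sums of squares and cross-products
B = np.zeros((2, 2))  # between-group sums of squares and cross-products
for scores in groups.values():
    centered = scores - scores.mean(axis=0)
    W += centered.T @ centered
    diff = (scores.mean(axis=0) - grand_mean).reshape(-1, 1)
    B += len(scores) * (diff @ diff.T)

# Wilks' lambda: values near 0 mean the group means differ strongly,
# values near 1 mean the groups look alike
wilks_lambda = np.linalg.det(W) / np.linalg.det(W + B)
print(wilks_lambda)
```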

6. Canonical Correlation Analysis:

This technique analyzes the relationship between two sets of variables. Its goal is to find linear combinations of each set of variables that are maximally correlated with each other.
Example: Studying the relationship between a set of psychological variables (e.g., anxiety, motivation) and a set of academic performance variables (e.g., grades, attendance).
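
The canonical correlations can be computed with basic linear algebra. The sketch below uses simulated data (the variables and the shared "drive" influence are assumptions made for illustration) and one standard numerical route: take an orthonormal basis of each centred variable set by QR decomposition, then read the canonical correlations off the singular values of their cross-product.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical shared influence behind both sets of variables
drive = rng.normal(size=n)

# Set 1: psychological variables
anxiety = -0.6 * drive + rng.normal(size=n)
motivation = 0.8 * drive + rng.normal(size=n)

# Set 2: academic performance variables
grades = 0.9 * drive + rng.normal(size=n)
attendance = 0.5 * drive + rng.normal(size=n)

X = np.column_stack([anxiety, motivation])
Y = np.column_stack([grades, attendance])

# Centre each set, then take an orthonormal basis of its column space
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
Qx, _ = np.linalg.qr(Xc)
Qy, _ = np.linalg.qr(Yc)

# Singular values of Qx' Qy are the canonical correlations, largest first
canonical_corrs = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
print(canonical_corrs)
```

The first canonical correlation is never smaller than the strongest simple correlation between any single variable in one set and any single variable in the other.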

Q10. Explain the techniques by which qualitative data can be analysed.

Ans. Qualitative data is non-numerical information, often in the form of text, audio, or video. This data provides deep insights into people’s experiences, opinions, feelings, and behaviors. The analysis of qualitative data is different from that of numerical data, as it places more emphasis on interpretation and pattern identification. The main techniques for analyzing qualitative data are as follows:

1. Content Analysis:

This is one of the most common techniques for analyzing qualitative data.
It involves systematically identifying, coding, and categorizing the presence, meanings, and relationships of certain words, concepts, themes, phrases, or sentences within texts, images, or media.
It can turn qualitative data into quantitative data (e.g., how many times a particular word was mentioned in an interview).
Example: Analyzing customer reviews to find out which features of a product they talk about the most.
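
The counting step of content analysis is easy to sketch. The reviews and the list of product features below are hypothetical; a real study would first define its categories and coding rules carefully.

```python
import re
from collections import Counter

# Hypothetical customer reviews
reviews = [
    "Battery life is great, but the camera is average.",
    "The camera is excellent and the battery lasts all day.",
    "Screen is sharp. Battery drains fast.",
]

# Categories (product features) the researcher has decided to look for
features = ["battery", "camera", "screen", "price"]

counts = Counter()
for review in reviews:
    # Lower-case the text and split it into words
    words = re.findall(r"[a-z]+", review.lower())
    for feature in features:
        counts[feature] += words.count(feature)

# Features ordered from most to least mentioned
print(counts.most_common())
```

This turns the text into a frequency table, which is the sense in which content analysis converts qualitative data into quantitative data.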

2. Thematic Analysis:

The focus of this technique is on identifying, analyzing, and reporting recurring patterns or ‘themes’ within the data.
It is more interpretive than content analysis. The researcher carefully reads the data, creates initial codes, and then groups these codes into broader themes that capture the meaning of the data.
Example: Analyzing employee interviews to identify the main themes of stress at the workplace (e.g., workload, management style, peer relations).

3. Grounded Theory:

This is an inductive approach where theory is developed from the data itself, rather than testing a pre-existing theory.
The researcher engages in a constant and iterative process of data collection and analysis. Through coding, memo-writing, and theoretical sampling, a theory gradually emerges from the data.
It is a rigorous and time-consuming process used to understand social processes.

4. Narrative Analysis:

This technique focuses on stories and personal accounts. The researcher examines how people narrate their experiences and how they make sense of them.
It analyzes the structure, content, and function of the narrative. It is useful for understanding people’s lives and identities.
Example: Analyzing the life stories of immigrants to understand their process of adaptation.

5. Discourse Analysis:

This is the study of language use in its social context. It looks not just at what is said, but also at how it is said.
It analyzes how language constructs power relations, social identities, and knowledge.
Example: Analyzing political speeches to understand how politicians use language to promote their ideologies.

In these techniques, coding is often a central process. It involves assigning labels or tags to segments of data in order to organize, retrieve, and analyze them.
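
The bookkeeping side of coding can be sketched in a few lines. The interview segments, codes, and themes below are hypothetical; assigning the codes is the researcher's interpretive work, and the program only organizes the result.

```python
from collections import defaultdict

# Hypothetical interview segments with the code the researcher assigned to each
coded_segments = [
    ("I take work home most nights", "long hours"),
    ("Deadlines keep getting shorter", "tight deadlines"),
    ("My manager rarely gives feedback", "poor feedback"),
    ("Targets change without warning", "unclear expectations"),
    ("We cover for two vacant posts", "long hours"),
]

# Hypothetical grouping of codes into broader themes
code_to_theme = {
    "long hours": "workload",
    "tight deadlines": "workload",
    "poor feedback": "management style",
    "unclear expectations": "management style",
}

# Collect the segments under each theme so they can be retrieved together
segments_by_theme = defaultdict(list)
for text, code in coded_segments:
    segments_by_theme[code_to_theme[code]].append(text)

# How many segments support each theme
theme_counts = {theme: len(texts) for theme, texts in segments_by_theme.items()}
print(theme_counts)
```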

Q11. Explain the concept and implications of the coefficient of determination.

Ans. Concept of the Coefficient of Determination: The Coefficient of Determination, also known as R-squared (R²), is a key statistical measure used in regression analysis. It measures the proportion of the variance in a dependent variable (Y) that can be explained or predicted by one or more independent variables (X).

Value: The value of R² lies between 0 and 1 (or 0% to 100% when expressed as a percentage).
Calculation: In simple linear regression, it is the square of the correlation coefficient (r) (i.e., R² = r²). In multiple regression, its calculation is more complex.
Interpretation:
- An R² of 0 means that the independent variable(s) do not explain any of the variance in the dependent variable. The model is a poor fit.
- An R² of 1 means that the independent variable(s) explain all of the variance in the dependent variable. The model is a perfect fit, and there is no error.
- An R² of 0.65 means that 65% of the variation in the dependent variable (Y) can be explained by the variation in the independent variable(s) (X). The remaining 35% of the variation is due to other unknown factors or random error.

In short, R² tells us how well the regression model ‘fits’ the data (goodness-of-fit). A higher R² value indicates that the model explains a larger portion of the variation in the observed data.
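
The definition can be checked numerically. The sketch below uses a small hypothetical data set, fits a simple linear regression, computes R² as one minus the ratio of unexplained to total variation, and compares it with the square of the correlation coefficient.

```python
import numpy as np

# Hypothetical data: y rises roughly 2 units for each unit of x
x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# Fit the line y = intercept + slope * x by least squares
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

ss_res = np.sum((y - y_hat) ** 2)     # unexplained (residual) variation
ss_tot = np.sum((y - y.mean()) ** 2)  # total variation around the mean of y
r_squared = 1 - ss_res / ss_tot

# In simple linear regression, R² equals the squared correlation coefficient
r = np.corrcoef(x, y)[0, 1]
print(r_squared, r ** 2)
```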

Implications of the Coefficient of Determination:

1. Assessment of Explanatory Power:

The primary use of R² is to evaluate the explanatory power of a regression model. It helps researchers understand how effective their chosen independent variables are in explaining the dependent variable. A model with a high R² is generally considered better than one with a low R².

2. Goodness-of-Fit Measure:

R² acts as a ‘goodness-of-fit’ measure. It indicates how close the observed data points are to the fitted regression line. If R² is high, the data points are close to the line, indicating a better fit.

3. Indication of Prediction Accuracy:

While it doesn’t directly measure prediction accuracy, a high R² suggests that the model is likely to have better predictive ability, as it captures most of the variation in the dependent variable.

Important Cautions and Limitations:

High R² does not always mean a good model: A high R² does not guarantee that the model is good. The model might have other problems, such as omitted variable bias or heteroscedasticity.
R² does not indicate causation: A high R² only shows a strong relationship between variables; it does not prove that the independent variable(s) cause the changes in the dependent variable.
Adjusted R²: When new independent variables are added to a model, the R² value always increases or stays the same, even if the new variable is insignificant. To overcome this problem, the Adjusted R² is used. It penalizes for the number of variables in the model and provides a more accurate assessment of the model’s explanatory power.
Field-specific Interpretation: What constitutes a ‘good’ R² value depends on the field of study. In the physical sciences, an R² above 0.90 might be common, whereas in the social sciences, where human behavior is involved, an R² of 0.30 might be considered significant.
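
The penalty applied by Adjusted R² can be seen with the usual formula, Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of independent variables. The function name and the figures in the sketch below are illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# The same R² of 0.65 from 30 observations, with 3 and then 10 predictors
few = adjusted_r_squared(0.65, n_obs=30, n_predictors=3)
many = adjusted_r_squared(0.65, n_obs=30, n_predictors=10)
print(round(few, 4), round(many, 4))
```

With the same R², the model that uses more independent variables receives the lower adjusted value, which is the penalty described above.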

Q12. Write short notes on any three of the following: (a) Confidence interval (b) Central limit theorem (c) Random variable (d) Standard error (e) Types of data

Ans. (a) Confidence Interval

A confidence interval is a statistical estimate that provides a range of plausible values for an unknown population parameter, such as a mean or proportion. It is more informative than a single point estimate. It is expressed with a confidence level, such as 95% or 99%. For example, a 95% confidence interval means that if we were to take repeated samples from the same population and calculate an interval for each sample, about 95% of those intervals would contain the true population parameter. It gives us a measure of the uncertainty in our sample estimate. A narrower interval indicates a more precise estimate, while a wider interval indicates less precision.

(b) Central Limit Theorem

The Central Limit Theorem (CLT) is one of the most important theorems in statistics. It states that if we take sufficiently large samples (usually n > 30) from a population with any type of distribution that has a finite variance, the distribution of the means of those samples (the sampling distribution of the mean) will be approximately normal, regardless of the distribution of the original population. The mean of the sampling distribution of the mean equals the population mean (μ), and its standard deviation (called the standard error) is σ/√n, where σ is the population standard deviation and n is the sample size. The CLT provides the basis for hypothesis testing and the construction of confidence intervals.

(c) Random Variable

A random variable is a variable whose value is a numerical outcome of a random phenomenon. Its value cannot be known with certainty before the event occurs. Random variables are classified into two main types:

1. Discrete Random Variable: It can only take a finite or countable number of values. Examples: The number obtained on rolling a die (1, 2, 3, 4, 5, 6), or the number of heads in 3 coin tosses (0, 1, 2, 3).
2. Continuous Random Variable: It can take any value within a given interval, so the number of possible values is uncountable. Examples: The height of a person, the temperature of a city, or the distance a car travels on one litre of petrol.

(d) Standard Error

The Standard Error (SE) is the standard deviation of the sampling distribution of a statistic, most commonly the sample mean. It indicates how much the value of the sample statistic is expected to vary, on average, from the true value of the population parameter. In simple terms, it is a measure of the precision of the estimate obtained from a sample. A smaller standard error indicates that the sample mean is a more precise estimate of the population mean. The formula for the standard error of the mean is SE = σ/√n, where σ is the population standard deviation and n is the sample size. As the sample size (n) increases, the standard error decreases, making the estimate more precise.

(e) Types of Data

Data can be classified in various ways based on its nature and characteristics. The main classification is as follows:

1. Qualitative or Categorical Data: This describes non-numerical characteristics.

Nominal: These data cannot be ordered; they simply represent categories or labels. Examples: Gender (Male, Female), Blood Group (A, B, AB, O), Color.
Ordinal: These data can be arranged in a logical order, but the differences between categories are not meaningful. Examples: Satisfaction level (Very Unsatisfied, Unsatisfied, Satisfied), Grades (A, B, C).

2. Quantitative or Numerical Data: This represents countable or measurable quantities.

Discrete: This can only take specific, distinct values and can be counted. Examples: Number of students in a class, number of children in a family.
Continuous: This can take any value within a given range and is measured. Examples: Height, weight, temperature.
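
Notes (a), (b) and (d) above can be illustrated together with a small simulation. The sketch assumes a skewed (exponential) population with mean 1 and standard deviation 1, a sample size of 50, and a known σ, so the 95% interval is the sample mean ± 1.96 × σ/√n. These choices are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Skewed population: exponential with scale 1 has mean 1 and standard deviation 1
mu, sigma = 1.0, 1.0
n, n_samples = 50, 5000

# Draw many samples of size n and record each sample mean
samples = rng.exponential(scale=1.0, size=(n_samples, n))
sample_means = samples.mean(axis=1)

# Standard error of the mean predicted by theory
standard_error = sigma / np.sqrt(n)

# 95% confidence interval around each sample mean (sigma treated as known)
lower = sample_means - 1.96 * standard_error
upper = sample_means + 1.96 * standard_error

# Fraction of intervals that contain the true population mean
coverage = np.mean((lower <= mu) & (mu <= upper))

print(sample_means.mean(), sample_means.std(ddof=1), standard_error, coverage)
```

The average of the sample means should sit close to μ, their spread close to σ/√n, and roughly 95% of the intervals should cover μ, even though the population itself is far from normal.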

