<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Big-Data on cloudmato.com</title><link>https://cloudmato.com/tags/big-data/</link><description>Recent content in Big-Data on cloudmato.com</description><generator>Hugo -- gohugo.io</generator><language>en</language><managingEditor>cloudmato.com</managingEditor><webMaster>cloudmato.com</webMaster><lastBuildDate>Sun, 14 Jun 2026 21:29:11 +0530</lastBuildDate><atom:link href="https://cloudmato.com/tags/big-data/index.xml" rel="self" type="application/rss+xml"/><item><title>What Is Hadoop, and Why It Isn't 10 Microservices on K8s</title><link>https://cloudmato.com/posts/what-is-hadoop-vs-microservices-kubernetes/</link><pubDate>Sun, 14 Jun 2026 21:29:11 +0530</pubDate><author>cloudmato.com</author><guid>https://cloudmato.com/posts/what-is-hadoop-vs-microservices-kubernetes/</guid><description>&lt;p&gt;Someone asked me this exact question last week, and it&amp;rsquo;s a good one because both setups &lt;em&gt;look&lt;/em&gt; the same if you squint. A bunch of machines, some shared storage in the middle, work spread across nodes. So why does one get called &amp;ldquo;big data&amp;rdquo; and the other &amp;ldquo;microservices&amp;rdquo;? Are they just two words for the same cluster? Honestly, no. They&amp;rsquo;re built on opposite assumptions about one thing: &lt;strong&gt;where the data lives and who moves to whom.&lt;/strong&gt;&lt;/p&gt;</description></item><item><title>Apache Spark: What It Is and Why Microservices Can't Replace It</title><link>https://cloudmato.com/posts/apache-spark-vs-scaling-microservices/</link><pubDate>Wed, 03 Jun 2026 17:43:05 +0530</pubDate><author>cloudmato.com</author><guid>https://cloudmato.com/posts/apache-spark-vs-scaling-microservices/</guid><description>&lt;p&gt;The &amp;ldquo;just scale microservices&amp;rdquo; question keeps coming up whenever Spark enters the conversation. 
It sounds logical: you already have distributed services, so just throw more of them at the problem. But the comparison collapses under a pretty basic question: &lt;em&gt;what kind of problem are you actually solving?&lt;/em&gt;&lt;/p&gt;
&lt;h2 class="header-anchor-wrapper"&gt;It Is Not a Database. Not a Queue.
&lt;/h2&gt;
&lt;p&gt;People come to Spark expecting something like a faster database or a smarter Kafka. Neither is accurate.&lt;/p&gt;</description></item></channel></rss>