Cross-Modal Retrieval: Why It Matters for Multimodal AI

There has been a lot of discussion around multimodal AI in recent times — how these systems can be built, open source options, small-scale alternatives, as well as tools for addressing fairness and bias in multimodal AI.

With its ability to simultaneously process different data types (think text, image, audio, video and more), the continuing development of multimodal AI represents the next step that would help to further enhance a wide range of tools — including those for generative AI and autonomous agentic AI.

To that end, improving how machines can find relevant information within this growing range of diverse data types is vital to further improving the capabilities of multimodal AI.

This could mean using a text prompt to search for a specific photo or video (text-image, text-video), or vice versa — a process that many of us are already familiar with.

The goal of cross-modal retrieval is to extract pertinent information across various types of data. However, this can be challenging due to the differences in data structures, feature spaces and how that information might be semantically portrayed across different modalities.

This results in a misalignment between those various semantic spaces and difficulties for direct comparison — a problem that researchers call the heterogeneous modality gap. Consequently, much of the research in the cross-modal retrieval field centers around finding and establishing shared frameworks for multimodal data, in order to facilitate cross-modal retrieval tasks.

Representation Learning in Cross-Modal Retrieval.

To tackle this problem, most cross-modal retrieval methods will typically use what is called representation learning. This process simplifies various kinds of raw, modal data into patterns — or representations — that a machine can understand so that they can be mapped into a shared space or framework, thus facilitating the extraction of useful information. Representation learning helps to enhance interpretability, uncover hidden aspects and also makes transfer learning easier.
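As a rough illustration of what a shared representation space enables, the sketch below stands in for real modality encoders with fixed random projections (all names, dimensions, and data are made up); the point is simply that, once both modalities land in the same space, retrieval reduces to nearest-neighbour search:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for real modality encoders (e.g., a text and an image network):
# here just fixed random projections into a shared 64-dimensional space.
TEXT_DIM, IMAGE_DIM, SHARED_DIM = 300, 2048, 64
W_text = rng.normal(size=(TEXT_DIM, SHARED_DIM))
W_image = rng.normal(size=(IMAGE_DIM, SHARED_DIM))

def embed(features, projection):
    """Project raw modality features into the shared space and L2-normalize."""
    z = features @ projection
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# A toy gallery of five images and one text query, both as raw feature vectors.
image_gallery = embed(rng.normal(size=(5, IMAGE_DIM)), W_image)
text_query = embed(rng.normal(size=(1, TEXT_DIM)), W_text)

# Cross-modal retrieval becomes nearest-neighbour search by cosine similarity
# in the shared space (vectors are normalized, so a dot product suffices).
scores = text_query @ image_gallery.T
print("Images ranked for the text query:", np.argsort(-scores[0]))
```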

Generally, these representation learning approaches in cross-modal retrieval can be split into two types, real-value retrieval and binary-value retrieval, each of which comes in supervised and unsupervised forms.

Real-value-based cross-modal retrieval aims to learn low-dimensional, real-valued representations of multimodal data, thus retaining richer semantic information.

A common representation space can be shared between different data types, with the most correlated data sitting adjacent to each other within that space.

For many years, one of the most commonly used algorithms for cross-modal retrieval was canonical correlation analysis (CCA), a classical statistical method that extracts features from the raw data and maximizes the correlation between paired representations of cross-modal data — such as images and text — before aligning them in a common subspace to facilitate cross-modal retrieval. However, the drawbacks of CCA include significant semantic gaps between different modalities, as it is best suited to capturing statistical relationships rather than more complex, nonlinear semantic relationships.
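A minimal sketch of the CCA approach using scikit-learn is shown below; the paired text and image features are random placeholders, so it only demonstrates the mechanics of fitting the projections and matching in the shared subspace:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(1)

# Paired features for the same 200 items in two modalities (placeholders for,
# say, bag-of-words text vectors and CNN image features).
n_pairs = 200
text_feats = rng.normal(size=(n_pairs, 50))
image_feats = rng.normal(size=(n_pairs, 80))

# Learn projections that maximize correlation between the paired views.
cca = CCA(n_components=10)
cca.fit(text_feats, image_feats)

# Both modalities now live in a common 10-dimensional subspace...
text_c, image_c = cca.transform(text_feats, image_feats)

# ...where a text query can be matched against image candidates directly.
query = text_c[0]
sims = image_c @ query / (np.linalg.norm(image_c, axis=1) * np.linalg.norm(query) + 1e-9)
print("Best-matching image index:", int(np.argmax(sims)))
```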

While real-value representation learning methods allow different data modalities to be more directly measured, the downside is that this approach requires more storage and computational resources.

Though classifications of real-value retrieval methods vary, they fall into these general categories, which can either be supervised or unsupervised:

Shallow real-value retrieval: Uses statistical analysis techniques to model multimodal data associations.

Deep real-value retrieval: Involves learning features, joint representations, and complex semantic relationships and patterns across varying data types, using deep neural networks (a minimal training sketch follows this list).

RNN (recurrent neural network) models: Used primarily to process sequential and time-series data (like text and video), and to combine it with image features extracted through CNN (convolutional neural network) models.

GAN (generative adversarial network): This deep learning architecture uses competing “generator” and “discriminator” components to learn the distribution of data. When used in cross-modal retrieval, it enables the model to learn correlations across varying data types.

Graph regularization: Due to its ability to accommodate multiple modalities within an integrated framework, it can capture a wide range of correlations between different forms of data.

Transformer methods: Based on a self-attention mechanism, the transformer architecture allows deep learning networks to process all incoming inputs concurrently, making it an effective option for cross-modal retrieval tasks.
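The sketch mentioned above is a bare-bones dual encoder trained with a symmetric, InfoNCE-style contrastive loss in PyTorch; the layer sizes and the random "paired" batch are placeholders rather than any particular published model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoder(nn.Module):
    """Two small MLP encoders mapping each modality into a shared space."""
    def __init__(self, text_dim=300, image_dim=2048, shared_dim=128):
        super().__init__()
        self.text_enc = nn.Sequential(nn.Linear(text_dim, 256), nn.ReLU(), nn.Linear(256, shared_dim))
        self.image_enc = nn.Sequential(nn.Linear(image_dim, 256), nn.ReLU(), nn.Linear(256, shared_dim))

    def forward(self, text, image):
        return (F.normalize(self.text_enc(text), dim=-1),
                F.normalize(self.image_enc(image), dim=-1))

def contrastive_loss(t, i, temperature=0.07):
    """Symmetric InfoNCE-style loss: matched text/image pairs should score highest."""
    logits = t @ i.T / temperature
    targets = torch.arange(t.size(0))
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

model = DualEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One toy training step on a random batch of "paired" features.
text_batch, image_batch = torch.randn(32, 300), torch.randn(32, 2048)
t, i = model(text_batch, image_batch)
loss = contrastive_loss(t, i)
loss.backward()
optimizer.step()
print(f"contrastive loss: {loss.item():.3f}")
```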

Binary-value retrieval, also called hashing-based cross-modal retrieval, is a form of representation learning that encodes data from different modalities by compressing it into binary codes, which are then mapped into a common Hamming space for learning. This enables more efficient and scalable search with reduced storage needs, though accuracy and semantic information may be slightly reduced. Another advantage of hashing retrieval is that binary hash codes are shorter and simpler than the original data, which helps to alleviate what computer scientists call the curse of dimensionality.

In both supervised and unsupervised hashing, hash functions are learned via an optimization process that minimizes the discrepancies between the original data and binary codes.
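As a toy illustration of retrieval in Hamming space, the sketch below hashes embeddings with fixed random projections (a simple stand-in for the learned hash functions described above) and ranks candidates by Hamming distance:

```python
import numpy as np

rng = np.random.default_rng(2)
N_BITS = 64

# Toy hash function: the sign of random projections from a (hypothetical)
# shared 128-dim embedding space; real cross-modal hashing methods learn these.
proj = rng.normal(size=(128, N_BITS))

def to_binary_code(embeddings):
    """Map real-valued embeddings to compact binary codes."""
    return (embeddings @ proj > 0).astype(np.uint8)

def hamming_distances(code, codes):
    """Number of differing bits between one code and a matrix of codes."""
    return np.count_nonzero(code != codes, axis=1)

# Pretend these came from image and text encoders sharing the same space.
image_codes = to_binary_code(rng.normal(size=(10_000, 128)))
text_query_code = to_binary_code(rng.normal(size=(1, 128)))[0]

# Retrieval in Hamming space: cheap bitwise comparisons and tiny storage.
distances = hamming_distances(text_query_code, image_codes)
print("Top-5 nearest items:", np.argsort(distances)[:5])
```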

Cross-modal hashing techniques can be divided into three main categories:

Supervised: Uses labeled data to train the hash functions, which helps to preserve the semantic similarities between paired instances of multimodal data while also maximizing the Hamming distance between non-matched instances. Supervised cross-modal hashing can be further classified as either shallow or deep learning-based.

Unsupervised: Does not use labeled data and instead relies on learning hash functions solely from the data distribution. These techniques utilize the correlation between data modalities to learn the relationships between them as encoded in binary form. Unsupervised methods can likewise be subdivided into shallow and deep retrieval methods.

Semi-supervised: These methods leverage rich, unlabeled datasets to improve the performance of models’ supervised learning.

As information becomes increasingly multimodal and heterogeneous, it will become vital to address the challenges in the field of cross-modal retrieval. This will help to close the gap between different data forms, boosting the accuracy and relevance of search results for human users, while also allowing machines to understand the world in a more human-like way.

When applied in the real world, cross-modal retrieval can be leveraged for a wide range of use cases, such as automatically generating accurate descriptions of various types of content, enhancing voice assistants’ ability to understand complex queries, or enabling more natural and intuitive human-computer interactions.

As cross-modal retrieval continues to evolve, issues like the heterogeneous modality gap, hierarchical semantic alignment, and nonlinear correlation learning between different modalities will require further development, as will user interfaces, privacy, and security.

To dig deeper into the available research into cross-modal retrieval and a dizzying array of tools and datasets, you can check out this categorized list on GitHub, as well as this toolbox, which includes some open source repositories.

Materialized Views in Data Stream Processing With RisingWave

Incremental computation in data streaming means updating results as fresh data comes in, without redoing all calculations from the beginning. This method is essential for handling ever-changing information, like real-time sensor readings, social media streams, or stock market figures.

In a traditional, non-incremental computation model, we need to process the entire dataset every time we get a new piece of data, which can be inefficient and slow. In incremental computation, only the part of the result affected by the new data is updated.

There are basically three steps involved in incremental computation in data streaming. The first step is an initial computation, which could be the result of the first chunk of data or a default value. The second step is processing incoming data: as new data streams in, the computation is incrementally updated, which might involve adding new values and removing old values that are no longer relevant. The third step is updating the intermediate results based on those changes.
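A minimal sketch of these three steps, using a running count and mean as the incrementally maintained result (the readings are invented):

```python
class RunningStats:
    """Incrementally maintained count, sum and mean over a data stream."""
    def __init__(self):
        self.count = 0      # step 1: initial state / default value
        self.total = 0.0

    def update(self, value):
        # steps 2-3: fold in the new value and update only the affected
        # intermediate results; nothing is recomputed from scratch.
        self.count += 1
        self.total += value

    @property
    def mean(self):
        return self.total / self.count if self.count else None

stats = RunningStats()
for reading in [21.5, 22.0, 22.4]:          # new data streaming in
    stats.update(reading)
    print(f"after {reading}: count={stats.count}, mean={stats.mean:.2f}")
```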

How Incremental Computation Benefits From Materialized Views (MV).

Materialized views have boosted incremental computation in data streaming. They speed up the retrieval and calculation of aggregated or pre-processed data across a flow of events or records. By keeping a pre-processed view of the data, materialized views make it easier to maintain and query large, ever-changing datasets, resulting in faster and more efficient calculations. In classic RDBMS terms, a materialized view is a database object that physically stores the query result rather than recalculating it each time the query is executed.

In the context of data streaming, however, materialized views can be thought of as precomputed result sets that are gradually updated as new data enters the system. Let’s drill down into the functionality.

Precomputed aggregation is the process of calculating and storing aggregate values (such as sums, averages, counts, min, max, etc.) in advance, before they are actually needed for queries or analysis; the result is often called a precomputed aggregate. The main objective of this process is to improve performance by avoiding redundant calculations.

Instead of repeatedly computing the aggregate values every time a query is made, the system retrieves the precomputed results, which is much faster. The process typically starts with data collection: as data is processed or streamed, the system aggregates it incrementally. Next comes storage, where the aggregated values are kept in a separate data structure. Finally, when the data is queried, the system retrieves the precomputed aggregate, eliminating the need to compute the result from scratch.
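Here is a small, self-contained sketch of that flow, with a plain dictionary standing in for the aggregate store and hypothetical page-view events as the stream:

```python
from collections import defaultdict

# Raw events keep flowing in; queries never rescan them.
precomputed = defaultdict(lambda: {"count": 0, "sum": 0.0})

def ingest(event):
    """Aggregate incrementally as data is collected."""
    agg = precomputed[event["page"]]
    agg["count"] += 1
    agg["sum"] += event["duration"]

def query_average(page):
    """Queries read the stored aggregate instead of recomputing from scratch."""
    agg = precomputed[page]
    return agg["sum"] / agg["count"] if agg["count"] else None

for e in [{"page": "/home", "duration": 3.0},
          {"page": "/docs", "duration": 7.0},
          {"page": "/home", "duration": 5.0}]:
    ingest(e)

print(query_average("/home"))  # 4.0, served from the precomputed aggregate
```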

With this approach, the system can efficiently handle more queries or data streams by avoiding repeated computations. Streaming applications and large-scale data processing benefit in particular, since query performance is enhanced by computing and storing summary information beforehand.

When it comes to data streaming, materialized views act like a cache for the query results so that the system does not need to repeatedly access the raw, potentially large stream data. By maintaining a precomputed result, the system ensures that the time-consuming processing steps only need to happen once, and subsequent updates to the view are relatively lightweight.

Materialized views increase query efficiency by storing the results of frequently used queries (aggregates, counts, averages, etc.). The system can return results instantly without scanning the entire dataset every time. Materialized views also allow more complex computations to be split up into smaller, more manageable pieces.

When new data arrives in the system, it typically affects only certain parts of the materialized view, and only those affected parts are updated. As a result, the system avoids the heavy computational burden of full re-evaluation and can focus on modifying just the required aggregates.

Real-time data processing and decision-making are crucial functions in many streaming applications. The materialized views reduce the time needed to generate the latest results from the incoming datasets by precomputing and storing query results with optimized system resource utilization.

In enterprise-level systems, using materialized views that refresh incrementally helps handle huge amounts of data without straining resources. The system can process and keep track of small changes, which means it can keep running even as the data grows. Still, care is needed where storage space is limited and must be used economically.

Materialized views make it easier to handle computations that involve complex logic or transformations over huge amounts of data. By storing the results of complex queries or transformations, materialized views offload computation work from the database when new data is processed. This makes incremental computation easier to manage, as only the necessary changes are applied and the complexity is abstracted away from the main query.

Effective Utilization of Materialized Views in RisingWave.

RisingWave is a leading platform for event stream processing, designed to offer a straightforward and cost-effective solution for handling real-time streaming data. It supports a Postgres-compatible SQL interface and a DataFrame-style Python interface. It functions as a streaming database and utilizes materialized views in an innovative manner to enable continuous analytics and data transformations.

This approach is particularly effective for latency-sensitive applications like alerting, monitoring, and trading, ensuring real-time insights and responsiveness. RisingWave is designed to deal with streaming data, and materialized views are key to this. These views store pre-calculated query results, letting the system serve fresh, current data without reprocessing entire datasets. As new data flows into the database, the system updates materialized views incrementally, reflecting changes without needing to redo all calculations and allowing for near real-time analysis.

In RisingWave, we can create materialized views using either tables or streams through source objects. If storing raw stream data isn't necessary, we can simply select the required fields, apply transformations, and retain only the resulting data. Here, materialized views are refreshed automatically and incrementally each time a new event occurs, rather than being updated manually or on a fixed schedule. Once a materialized view is created, the RisingWave engine continuously monitors for relevant incoming events, ensuring efficient computations by processing only the new data. This design minimizes computing overhead.
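A minimal sketch of that workflow from Python, assuming a locally running RisingWave instance reachable over its Postgres-compatible endpoint; the table, column, and view names are hypothetical, and the connection defaults (port 4566, database dev, user root) follow RisingWave's documentation, so adjust them for your deployment:

```python
import psycopg2

# Connect to RisingWave's Postgres-compatible endpoint (details illustrative).
conn = psycopg2.connect(host="localhost", port=4566, dbname="dev", user="root")
conn.autocommit = True
cur = conn.cursor()

# A hypothetical stream of sensor readings ingested into a table.
cur.execute("""
    CREATE TABLE IF NOT EXISTS sensor_readings (
        sensor_id INT,
        temperature DOUBLE PRECISION,
        reported_at TIMESTAMP
    )
""")

# The materialized view is maintained incrementally as new rows arrive;
# no manual or scheduled refresh is needed.
cur.execute("""
    CREATE MATERIALIZED VIEW avg_temperature_per_sensor AS
    SELECT sensor_id, AVG(temperature) AS avg_temp, COUNT(*) AS readings
    FROM sensor_readings
    GROUP BY sensor_id
""")

# Queries read the continuously maintained result.
cur.execute("SELECT * FROM avg_temperature_per_sensor ORDER BY sensor_id")
print(cur.fetchall())
```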

Additionally, RisingWave offers snapshot read consistency, guaranteeing that queries run against materialized views or tables within the same database produce consistent results across them.

Besides, we can build complex, tree-structured transformations with layered materialized views in order to ensure consistent data at each level and avoid cascading failures or slow refresh rates. In stream processing applications, Apache Kafka often serves as a bridge to integrate various processing workflows.

(You can read here how to integrate Apache Kafka in KRaft Mode with RisingWave for Event Streaming Analytics). The distinct advantage of MV-on-MV lies in its ability to remove the need for intricate inter-system pipelines, enabling you to chain your transformation logic effortlessly.
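Continuing the hypothetical sketch above, an MV-on-MV chain is simply a materialized view defined on top of another one, with both layers kept incrementally up to date:

```python
# Reusing the cursor from the earlier sketch: a second materialized view
# layered on avg_temperature_per_sensor (MV-on-MV).
cur.execute("""
    CREATE MATERIALIZED VIEW hot_sensors AS
    SELECT sensor_id, avg_temp
    FROM avg_temperature_per_sensor
    WHERE avg_temp > 30
""")

cur.execute("SELECT * FROM hot_sensors")
print(cur.fetchall())
```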

Note: You can visit here to learn more about RisingWave and all its other functionality, including core product concepts and the various source and sink connectors for data streams. In this write-up, I have emphasized leveraging materialized views as a backbone in RisingWave.

I hope you enjoyed reading this. If you found this article valuable, please consider liking and sharing it.

Meta Enhances Download Your Information Tool with Data Logs

Meta has recently introduced data logs as part of its Download Your Information (DYI) tool, enabling users to access additional data about their product usage. This development is aimed at enhancing transparency and user control over personal data.

A blog post on Meta’s engineering blog summarised the journey. The implementation of data logs presented challenges due to the scale of Meta's operations and the limitations of its data warehouse system, Hive. The primary challenge was the inefficiency of querying Hive tables, which are partitioned by date and time, requiring a scan of every row in every partition to retrieve data for a specific user. With over 3 billion monthly active users, this approach would process an enormous amount of irrelevant data for each query.

Meta developed a system that amortizes the cost of expensive full table scans by batching individual users' requests into a single scan. This method provides sufficiently predictable performance characteristics to make the feature feasible, at the cost of processing some irrelevant data.

The current design utilizes Meta's internal task-scheduling service to organize recent requests for individuals' data logs into batches. These batches are submitted to a system built on the Core Workflow Service (CWS), which ensures reliable execution of long-running tasks. The process involves copying user IDs into a new Hive table, initiating worker tasks for each data logs table, and executing jobs in Dataswarm, Meta's data pipeline system.

The jobs perform an INNER JOIN between the table containing requesters' IDs and the column identifying the data owner in each table. This operation produces an intermediate Hive table containing combined data logs for all users in the current batch. PySpark then processes this output to split it into individual files for each user's data in a given partition.
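The PySpark fragment below is an illustrative reconstruction of that step rather than Meta's actual code; the table names, the owner_id and user_id columns, and the output path are all hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dyi-data-logs-batch").getOrCreate()

# Hypothetical warehouse tables: the batch of requesters' IDs and one
# data-logs table (Hive-backed in Meta's case).
requesters = spark.table("dyi_batch_requesters")      # column: user_id
data_logs = spark.table("some_product_data_logs")     # includes owner_id, ...

# One full scan of the logs table serves every user in the batch at once.
batch_logs = data_logs.join(
    requesters,
    data_logs["owner_id"] == requesters["user_id"],
    how="inner",
)

# Split the combined intermediate result into per-user output files.
(batch_logs
    .repartition("user_id")
    .write
    .partitionBy("user_id")
    .mode("overwrite")
    .json("/tmp/dyi_output/some_product_data_logs"))
```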

The resulting raw data logs are further processed using Meta's Hack language to apply privacy rules and filters, rendering the data into meaningful, well-explained HTML files. Finally, the results are aggregated into a ZIP file and made available through the DYI tool.

Throughout the development process, Meta learned some essential lessons. The team found it essential to implement robust checkpointing mechanisms to enable incremental progress and resilience against errors and temporary failures. This increased overall system throughput by allowing work to resume piecemeal after issues like job timeouts or memory-related failures.

When ensuring data correctness, Meta encountered a Spark concurrency bug that could have led to data being returned to the wrong user. To address this, they implemented a verification step in the post-processing stage to ensure that the user ID in the data matches the identifier of the user whose logs are being generated.
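Such a verification step might look roughly like the following (field names are hypothetical):

```python
def verify_ownership(requested_user_id, rendered_rows):
    """Defensive post-processing check: every row written to a user's archive
    must actually belong to that user, guarding against mix-ups such as the
    concurrency bug described above."""
    for row in rendered_rows:
        if row["owner_id"] != requested_user_id:
            raise ValueError(
                f"data for user {row['owner_id']} leaked into the "
                f"output of user {requested_user_id}"
            )
    return rendered_rows

# Passes silently when all rows belong to the requesting user.
verify_ownership(42, [{"owner_id": 42, "event": "login"}])
```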

The complexity of the data workflows required advanced tools and the ability to iterate on code changes quickly. Meta then built an experimentation platform that allows for running modified versions of workflows and independently executing phases of the process to create faster cycles of testing and development.

Hardik Khandelwal, Software Engineer III at Google, appreciated the engineering principles behind data logs, mentioning:

What stands out to me is how solid software engineering principles enabled this at scale:
- Batching requests to efficiently query massive datasets.
- Checkpointing to ensure incremental progress and fault tolerance.
- Security checks to enforce privacy rules and prevent data leakage.
This system was a massive engineering challenge—querying petabytes of data from Hive without overwhelming infrastructure.

Meta has also been in the news recently as it unveiled the Automated Compliance Hardening (ACH) tool and open-sourced the Large Concept Model (LCM). The ACH tool is a mutation-guided, LLM-based test generation system, while LCM is a language model designed to operate at a higher abstraction level than tokens.

Meta also emphasized the importance of making the data consistently understandable and explainable to end users. This involves collaboration between access experts and specialist teams to review data tables, ensuring that sensitive information is not exposed and that internal technical jargon is translated into user-friendly terms.

Finally, the processed content is implemented in code using renderers that transform raw values into user-friendly representations. This includes converting numeric IDs into meaningful entity references, converting enum values into descriptive text, and removing technical terms.

Market Impact Analysis

Market Growth Trend

Year | Growth Rate
2018 | 7.5%
2019 | 9.0%
2020 | 9.4%
2021 | 10.5%
2022 | 11.0%
2023 | 11.4%
2024 | 11.5%

Quarterly Growth Rate

Quarter | Growth Rate
Q1 2024 | 10.8%
Q2 2024 | 11.1%
Q3 2024 | 11.3%
Q4 2024 | 11.5%

Market Segments and Growth Drivers

Segment | Market Share | Growth Rate
Enterprise Software | 38% | 10.8%
Cloud Services | 31% | 17.5%
Developer Tools | 14% | 9.3%
Security Software | 12% | 13.2%
Other Software | 5% | 7.5%

Competitive Landscape Analysis

Company | Market Share
Microsoft | 22.6%
Oracle | 14.8%
SAP | 12.5%
Salesforce | 9.7%
Adobe | 8.3%

Future Outlook and Predictions

The cross-modal data landscape is evolving rapidly, driven by technological advancements, changing threat vectors, and shifting business requirements. Based on current trends and expert analyses, we can anticipate several significant developments across different time horizons:

Year-by-Year Technology Evolution

Based on current trajectory and expert analyses, we can project the following development timeline:

2024: Early adopters begin implementing specialized solutions with measurable results
2025: Industry standards emerging to facilitate broader adoption and integration
2026: Mainstream adoption begins as technical barriers are addressed
2027: Integration with adjacent technologies creates new capabilities
2028: Business models transform as capabilities mature
2029: Technology becomes embedded in core infrastructure and processes
2030: New paradigms emerge as the technology reaches full maturity

Technology Maturity Curve

Different technologies within the ecosystem are at varying stages of maturity, influencing adoption timelines and investment priorities:

Innovation Trigger

  • Generative AI for specialized domains
  • Blockchain for supply chain verification

Peak of Inflated Expectations

  • Digital twins for business processes
  • Quantum-resistant cryptography

Trough of Disillusionment

  • Consumer AR/VR applications
  • General-purpose blockchain

Slope of Enlightenment

  • AI-driven analytics
  • Edge computing

Plateau of Productivity

  • Cloud infrastructure
  • Mobile applications

Expert Perspectives

Leading experts in the software dev sector provide diverse perspectives on how the landscape will evolve over the coming years:

"Technology transformation will continue to accelerate, creating both challenges and opportunities."

— Industry Expert

"Organizations must balance innovation with practical implementation to achieve meaningful results."

— Technology Analyst

"The most successful adopters will focus on business outcomes rather than technology for its own sake."

— Research Director

Areas of Expert Consensus

  • Acceleration of Innovation: The pace of technological evolution will continue to increase
  • Practical Integration: Focus will shift from proof-of-concept to operational deployment
  • Human-Technology Partnership: Most effective implementations will optimize human-machine collaboration
  • Regulatory Influence: Regulatory frameworks will increasingly shape technology development

Short-Term Outlook (1-2 Years)

In the immediate future, organizations will focus on implementing and optimizing currently available technologies to address pressing software dev challenges:

  • Technology adoption accelerating across industries
  • Digital transformation initiatives becoming mainstream

These developments will be characterized by incremental improvements to existing frameworks rather than revolutionary changes, with emphasis on practical deployment and measurable outcomes.

Mid-Term Outlook (3-5 Years)

As technologies mature and organizations adapt, more substantial transformations will emerge in how security is approached and implemented:

  • Significant transformation of business processes through advanced technologies
  • New digital business models emerging

This period will see significant changes in security architecture and operational models, with increasing automation and integration between previously siloed security functions. Organizations will shift from reactive to proactive security postures.

Long-Term Outlook (5+ Years)

Looking further ahead, more fundamental shifts will reshape how cybersecurity is conceptualized and implemented across digital ecosystems:

  • Fundamental shifts in how technology integrates with business and society
  • Emergence of new technology paradigms

These long-term developments will likely require significant technical breakthroughs, new regulatory frameworks, and evolution in how organizations approach security as a fundamental business function rather than a technical discipline.

Key Risk Factors and Uncertainties

Several critical factors could significantly impact the trajectory of software dev evolution:

  • Technical debt accumulation
  • Security integration challenges
  • Maintaining code quality

Organizations should monitor these factors closely and develop contingency strategies to mitigate potential negative impacts on technology implementation timelines.

Alternative Future Scenarios

The evolution of technology can follow different paths depending on various factors including regulatory developments, investment trends, technological breakthroughs, and market adoption. We analyze three potential scenarios:

Optimistic Scenario

Rapid adoption of advanced technologies with significant business impact

Key Drivers: Supportive regulatory environment, significant research breakthroughs, strong market incentives, and rapid user adoption.

Probability: 25-30%

Base Case Scenario

Measured implementation with incremental improvements

Key Drivers: Balanced regulatory approach, steady technological progress, and selective implementation based on clear ROI.

Probability: 50-60%

Conservative Scenario

Technical and organizational barriers limiting effective adoption

Key Drivers: Restrictive regulations, technical limitations, implementation challenges, and risk-averse organizational cultures.

Probability: 15-20%

Scenario Comparison Matrix

Factor | Optimistic | Base Case | Conservative
Implementation Timeline | Accelerated | Steady | Delayed
Market Adoption | Widespread | Selective | Limited
Technology Evolution | Rapid | Progressive | Incremental
Regulatory Environment | Supportive | Balanced | Restrictive
Business Impact | Transformative | Significant | Modest

Transformational Impact

Technology becoming increasingly embedded in all aspects of business operations. This evolution will necessitate significant changes in organizational structures, talent development, and strategic planning processes.

The convergence of multiple technological trends—including artificial intelligence, quantum computing, and ubiquitous connectivity—will create both unprecedented security challenges and innovative defensive capabilities.

Implementation Challenges

Technical complexity and organizational readiness remain key challenges. Organizations will need to develop comprehensive change management strategies to successfully navigate these transitions.

Regulatory uncertainty, particularly around emerging technologies like AI in security applications, will require flexible security architectures that can adapt to evolving compliance requirements.

Key Innovations to Watch

Artificial intelligence, distributed systems, and automation technologies leading innovation. Organizations should monitor these developments closely to maintain competitive advantages and effective security postures.

Strategic investments in research partnerships, technology pilots, and talent development will position forward-thinking organizations to leverage these innovations early in their development cycle.

Technical Glossary

Key technical terms and definitions to help understand the technologies discussed in this article.

Understanding the following technical concepts is essential for grasping the full implications of the security threats and defensive measures discussed in this article. These definitions provide context for both technical and non-technical readers.

interface (intermediate): Well-designed interfaces abstract underlying complexity while providing clearly defined methods for interaction between different system components.

platform (intermediate): Platforms provide standardized environments that reduce development complexity and enable ecosystem growth through shared functionality and integration capabilities.

API (beginner): APIs serve as the connective tissue in modern software architectures, enabling different applications and services to communicate and share data according to defined protocols and data formats.
Example: Cloud service providers like AWS, Google Cloud, and Azure offer extensive APIs that allow organizations to programmatically provision and manage infrastructure and services.
